DPDK patches and discussions
* [dpdk-dev] [PATCHv2 0/5] ACL library
@ 2014-05-28 19:26 Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 1/5] acl: Add ACL library (librte_acl) into DPDK Konstantin Ananyev
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Konstantin Ananyev @ 2014-05-28 19:26 UTC (permalink / raw)
  To: dev, dev

The ACL library is used to perform an N-tuple search over a set of rules
with multiple categories and find the best match (highest priority)
for each category.
This code was previously released under a proprietary license,
but is now being released under a BSD license to allow its
integration with the rest of the Intel DPDK codebase.

Note that this patch series requires another patch,
"lpm: Introduce rte_lpm_lookupx4", to be applied first.

This patch series contains the following items:
1) librte_acl.
2) UT changes to reflect the latest changes in the rte_acl library.
3) test-acl: usage example and main test application for the ACL library.
   Provides IPv4/IPv6 5-tuple classification.
4) l3fwd-acl: demonstrates the use of the ACL library in the DPDK application
   to implement packet classification and L3 forwarding.
5) add doxygen configuration and start page

v2 fixes:
* Fixed several checkpatch.pl issues
* Added doxygen related changes

 app/Makefile                         |    1 +
 app/test-acl/Makefile                |   45 +
 app/test-acl/main.c                  | 1029 +++++++++++++++++
 app/test-acl/main.h                  |   50 +
 app/test/test_acl.c                  |  128 ++-
 config/common_linuxapp               |    6 +
 doc/doxy-api-index.md                |    3 +-
 doc/doxy-api.conf                    |    3 +-
 examples/Makefile                    |    1 +
 examples/l3fwd-acl/Makefile          |   56 +
 examples/l3fwd-acl/main.c            | 2048 ++++++++++++++++++++++++++++++++++
 examples/l3fwd-acl/main.h            |   45 +
 lib/librte_acl/Makefile              |   60 +
 lib/librte_acl/acl.h                 |  182 +++
 lib/librte_acl/acl_bld.c             | 2001 +++++++++++++++++++++++++++++++++
 lib/librte_acl/acl_gen.c             |  473 ++++++++
 lib/librte_acl/acl_run.c             |  927 +++++++++++++++
 lib/librte_acl/acl_vect.h            |  129 +++
 lib/librte_acl/rte_acl.c             |  413 +++++++
 lib/librte_acl/rte_acl.h             |  453 ++++++++
 lib/librte_acl/rte_acl_osdep.h       |   92 ++
 lib/librte_acl/rte_acl_osdep_alone.h |  277 +++++
 lib/librte_acl/tb_mem.c              |  102 ++
 lib/librte_acl/tb_mem.h              |   73 ++
 24 files changed, 8552 insertions(+), 45 deletions(-)
 create mode 100644 app/test-acl/Makefile
 create mode 100644 app/test-acl/main.c
 create mode 100644 app/test-acl/main.h
 create mode 100644 examples/l3fwd-acl/Makefile
 create mode 100644 examples/l3fwd-acl/main.c
 create mode 100644 examples/l3fwd-acl/main.h
 create mode 100644 lib/librte_acl/Makefile
 create mode 100644 lib/librte_acl/acl.h
 create mode 100644 lib/librte_acl/acl_bld.c
 create mode 100644 lib/librte_acl/acl_gen.c
 create mode 100644 lib/librte_acl/acl_run.c
 create mode 100644 lib/librte_acl/acl_vect.h
 create mode 100644 lib/librte_acl/rte_acl.c
 create mode 100644 lib/librte_acl/rte_acl.h
 create mode 100644 lib/librte_acl/rte_acl_osdep.h
 create mode 100644 lib/librte_acl/rte_acl_osdep_alone.h
 create mode 100644 lib/librte_acl/tb_mem.c
 create mode 100644 lib/librte_acl/tb_mem.h

-- 
1.7.7.6


* [dpdk-dev] [PATCHv2 1/5] acl: Add ACL library (librte_acl) into DPDK.
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
@ 2014-05-28 19:26 ` Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 2/5] acl: update UT to reflect latest changes in the librte_acl Konstantin Ananyev
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ananyev @ 2014-05-28 19:26 UTC (permalink / raw)
  To: dev, dev

The ACL library is used to perform an N-tuple search over a set of rules with
multiple categories and find the best match for each category.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 config/common_linuxapp               |    6 +
 lib/librte_acl/Makefile              |   60 +
 lib/librte_acl/acl.h                 |  182 +++
 lib/librte_acl/acl_bld.c             | 2001 ++++++++++++++++++++++++++++++++++
 lib/librte_acl/acl_gen.c             |  473 ++++++++
 lib/librte_acl/acl_run.c             |  927 ++++++++++++++++
 lib/librte_acl/acl_vect.h            |  129 +++
 lib/librte_acl/rte_acl.c             |  413 +++++++
 lib/librte_acl/rte_acl.h             |  453 ++++++++
 lib/librte_acl/rte_acl_osdep.h       |   92 ++
 lib/librte_acl/rte_acl_osdep_alone.h |  277 +++++
 lib/librte_acl/tb_mem.c              |  102 ++
 lib/librte_acl/tb_mem.h              |   73 ++
 13 files changed, 5188 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_acl/Makefile
 create mode 100644 lib/librte_acl/acl.h
 create mode 100644 lib/librte_acl/acl_bld.c
 create mode 100644 lib/librte_acl/acl_gen.c
 create mode 100644 lib/librte_acl/acl_run.c
 create mode 100644 lib/librte_acl/acl_vect.h
 create mode 100644 lib/librte_acl/rte_acl.c
 create mode 100644 lib/librte_acl/rte_acl.h
 create mode 100644 lib/librte_acl/rte_acl_osdep.h
 create mode 100644 lib/librte_acl/rte_acl_osdep_alone.h
 create mode 100644 lib/librte_acl/tb_mem.c
 create mode 100644 lib/librte_acl/tb_mem.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 62619c6..fcfed6f 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -337,3 +337,9 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
 #
 CONFIG_RTE_NIC_BYPASS=n
 
+# Compile librte_acl
+#
+CONFIG_RTE_LIBRTE_ACL=y
+CONFIG_RTE_LIBRTE_ACL_DEBUG=n
+CONFIG_RTE_LIBRTE_ACL_STANDALONE=n
+
diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
new file mode 100644
index 0000000..4fe4593
--- /dev/null
+++ b/lib/librte_acl/Makefile
@@ -0,0 +1,60 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_acl.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+# all sources are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += tb_mem.c
+
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run.c
+
+# install these header files
+SYMLINK-$(CONFIG_RTE_LIBRTE_ACL)-include := rte_acl_osdep.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_ACL)-include += rte_acl.h
+
+ifeq ($(CONFIG_RTE_LIBRTE_ACL_STANDALONE),y)
+# standalone build
+SYMLINK-$(CONFIG_RTE_LIBRTE_ACL)-include += rte_acl_osdep_alone.h
+else
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ACL) += lib/librte_eal lib/librte_malloc
+endif
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
new file mode 100644
index 0000000..e6d7985
--- /dev/null
+++ b/lib/librte_acl/acl.h
@@ -0,0 +1,182 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef	_ACL_H_
+#define	_ACL_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif /* __cplusplus */
+
+#define RTE_ACL_QUAD_MAX	5
+#define RTE_ACL_QUAD_SIZE	4
+#define RTE_ACL_QUAD_SINGLE	UINT64_C(0x7f7f7f7f00000000)
+
+#define RTE_ACL_SINGLE_TRIE_SIZE	2000
+
+#define RTE_ACL_DFA_MAX		UINT8_MAX
+#define RTE_ACL_DFA_SIZE	(UINT8_MAX + 1)
+
+typedef int bits_t;
+
+#define	RTE_ACL_BIT_SET_SIZE	((UINT8_MAX + 1) / (sizeof(bits_t) * CHAR_BIT))
+
+struct rte_acl_bitset {
+	bits_t             bits[RTE_ACL_BIT_SET_SIZE];
+};
+
+#define	RTE_ACL_NODE_DFA	(0 << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_SINGLE	(1U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_QEXACT	(2U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_QRANGE	(3U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_MATCH	(4U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_TYPE	(7U << RTE_ACL_TYPE_SHIFT)
+#define	RTE_ACL_NODE_UNDEFINED	UINT32_MAX
+
+/*
+ * Structure of a node is a set of ptrs and each ptr has a bit map
+ * of values associated with this transition.
+ */
+struct rte_acl_ptr_set {
+	struct rte_acl_bitset values;	/* input values associated with ptr */
+	struct rte_acl_node  *ptr;	/* transition to next node */
+};
+
+struct rte_acl_classifier_results {
+	int results[RTE_ACL_MAX_CATEGORIES];
+};
+
+struct rte_acl_match_results {
+	uint32_t results[RTE_ACL_MAX_CATEGORIES];
+	int32_t priority[RTE_ACL_MAX_CATEGORIES];
+};
+
+struct rte_acl_node {
+	uint64_t node_index;  /* index for this node */
+	uint32_t level;       /* level 0-n in the trie */
+	uint32_t ref_count;   /* ref count for this node */
+	struct rte_acl_bitset  values;
+	/* set of all values that map to another node
+	 * (union of bits in each transition).
+	 */
+	uint32_t                num_ptrs; /* number of ptr_set in use */
+	uint32_t                max_ptrs; /* number of allocated ptr_set */
+	uint32_t                min_add;  /* number of ptr_set per allocation */
+	struct rte_acl_ptr_set *ptrs;     /* transitions array for this node */
+	int32_t                 match_flag;
+	int32_t                 match_index; /* index to match data */
+	uint32_t                node_type;
+	int32_t                 fanout;
+	/* number of ranges (transitions w/ consecutive bits) */
+	int32_t                 id;
+	struct rte_acl_match_results *mrt; /* only valid when match_flag != 0 */
+	char                         transitions[RTE_ACL_QUAD_SIZE];
+	/* boundaries for ranged node */
+	struct rte_acl_node     *next;
+	/* free list link or pointer to duplicate node during merge */
+	struct rte_acl_node     *prev;
+	/* points to node from which this node was duplicated */
+
+	uint32_t                subtree_id;
+	uint32_t                subtree_ref_count;
+
+};
+enum {
+	RTE_ACL_SUBTREE_NODE = 0x80000000
+};
+
+/*
+ * Types of tries used to generate runtime structure(s)
+ */
+enum {
+	RTE_ACL_FULL_TRIE = 0,
+	RTE_ACL_NOSRC_TRIE = 1,
+	RTE_ACL_NODST_TRIE = 2,
+	RTE_ACL_NOPORTS_TRIE = 4,
+	RTE_ACL_NOVLAN_TRIE = 8,
+	RTE_ACL_UNUSED_TRIE = 0x80000000
+};
+
+
+/** MAX number of tries per one ACL context.*/
+#define RTE_ACL_MAX_TRIES	8
+
+/** Max number of characters in PM name.*/
+#define RTE_ACL_NAMESIZE	32
+
+
+struct rte_acl_trie {
+	uint32_t        type;
+	uint32_t        count;
+	int32_t         smallest;  /* smallest rule in this trie */
+	uint32_t        root_index;
+	const uint32_t *data_index;
+	uint32_t        num_data_indexes;
+};
+
+struct rte_acl_bld_trie {
+	struct rte_acl_node *trie;
+};
+
+struct rte_acl_ctx {
+	TAILQ_ENTRY(rte_acl_ctx) next;    /**< Next in list. */
+	char                name[RTE_ACL_NAMESIZE];
+	/** Name of the ACL context. */
+	int32_t             socket_id;
+	/** Socket ID to allocate memory from. */
+	void               *rules;
+	uint32_t            max_rules;
+	uint32_t            rule_sz;
+	uint32_t            num_rules;
+	uint32_t            num_categories;
+	uint32_t            num_tries;
+	uint32_t            match_index;
+	uint64_t            no_match;
+	uint64_t            idle;
+	uint64_t           *trans_table;
+	uint32_t           *data_indexes;
+	struct rte_acl_trie trie[RTE_ACL_MAX_TRIES];
+	void               *mem;
+	size_t              mem_sz;
+	struct rte_acl_config config; /* copy of build config. */
+};
+
+int rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie,
+	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
+	uint32_t num_categories, uint32_t data_index_sz, int match_num);
+
+#ifdef __cplusplus
+}
+#endif /* __cplusplus */
+
+#endif /* _ACL_H_ */
diff --git a/lib/librte_acl/acl_bld.c b/lib/librte_acl/acl_bld.c
new file mode 100644
index 0000000..66dd847
--- /dev/null
+++ b/lib/librte_acl/acl_bld.c
@@ -0,0 +1,2001 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "tb_mem.h"
+#include "acl.h"
+
+#define	ACL_POOL_ALIGN		8
+#define	ACL_POOL_ALLOC_MIN	0x800000
+
+/* number of pointers per alloc */
+#define ACL_PTR_ALLOC	32
+
+/* variable for dividing rule sets */
+#define NODE_MAX	2500
+#define NODE_PERCENTAGE	(0.40)
+#define RULE_PERCENTAGE	(0.40)
+
+/* TALLY are statistics per field */
+enum {
+	TALLY_0 = 0,        /* number of rules that are 0% or more wild. */
+	TALLY_25,	    /* number of rules that are 25% or more wild. */
+	TALLY_50,
+	TALLY_75,
+	TALLY_100,
+	TALLY_DEACTIVATED, /* deactivated fields (100% wild in all rules). */
+	TALLY_DEPTH,
+	/* number of rules that are 100% wild for this field and higher. */
+	TALLY_NUM
+};
+
+static const uint32_t wild_limits[TALLY_DEACTIVATED] = {0, 25, 50, 75, 100};
+
+enum {
+	ACL_INTERSECT_NONE = 0,
+	ACL_INTERSECT_A = 1,    /* set A is a superset of the A/B intersection */
+	ACL_INTERSECT_B = 2,    /* set B is a superset of the A/B intersection */
+	ACL_INTERSECT = 4,	/* sets A and B intersect */
+};
+
+enum {
+	ACL_PRIORITY_EQUAL = 0,
+	ACL_PRIORITY_NODE_A = 1,
+	ACL_PRIORITY_NODE_B = 2,
+	ACL_PRIORITY_MIXED = 3
+};
+
+
+struct acl_mem_block {
+	uint32_t block_size;
+	void     *mem_ptr;
+};
+
+#define	MEM_BLOCK_NUM	16
+
+/* Single ACL rule, build representation.*/
+struct rte_acl_build_rule {
+	struct rte_acl_build_rule   *next;
+	struct rte_acl_config       *config;
+	/**< configuration for each field in the rule. */
+	const struct rte_acl_rule   *f;
+	uint32_t                    *wildness;
+};
+
+/* Context for build phase */
+struct acl_build_context {
+	const struct rte_acl_ctx *acx;
+	struct rte_acl_build_rule *build_rules;
+	struct rte_acl_config     cfg;
+	uint32_t                  node;
+	uint32_t                  num_nodes;
+	uint32_t                  category_mask;
+	uint32_t                  num_rules;
+	uint32_t                  node_id;
+	uint32_t                  src_mask;
+	uint32_t                  num_build_rules;
+	uint32_t                  num_tries;
+	struct tb_mem_pool        pool;
+	struct rte_acl_trie       tries[RTE_ACL_MAX_TRIES];
+	struct rte_acl_bld_trie   bld_tries[RTE_ACL_MAX_TRIES];
+	uint32_t            data_indexes[RTE_ACL_MAX_TRIES][RTE_ACL_MAX_FIELDS];
+
+	/* memory free lists for nodes and blocks used for node ptrs */
+	struct acl_mem_block      blocks[MEM_BLOCK_NUM];
+	struct rte_acl_node       *node_free_list;
+};
+
+static int acl_merge_trie(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	uint32_t level, uint32_t subtree_id, struct rte_acl_node **node_c);
+
+static int acl_merge(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	int move, int a_subset, int level);
+
+static void
+acl_deref_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node, int index);
+
+static void *
+acl_build_alloc(struct acl_build_context *context, size_t n, size_t s)
+{
+	uint32_t m;
+	void *p;
+	size_t alloc_size = n * s;
+
+	/*
+	 * look for memory in free lists
+	 */
+	for (m = 0; m < RTE_DIM(context->blocks); m++) {
+		if (context->blocks[m].block_size ==
+		   alloc_size && context->blocks[m].mem_ptr != NULL) {
+			p = context->blocks[m].mem_ptr;
+			context->blocks[m].mem_ptr = *((void **)p);
+			memset(p, 0, alloc_size);
+			return (p);
+		}
+	}
+
+	/*
+	 * return allocation from memory pool
+	 */
+	p = tb_alloc(&context->pool, alloc_size);
+	return (p);
+}
+
+/*
+ * Free memory blocks (kept in context for reuse).
+ */
+static void
+acl_build_free(struct acl_build_context *context, size_t s, void *p)
+{
+	uint32_t n;
+
+	for (n = 0; n < RTE_DIM(context->blocks); n++) {
+		if (context->blocks[n].block_size == s) {
+			*((void **)p) = context->blocks[n].mem_ptr;
+			context->blocks[n].mem_ptr = p;
+			return;
+		}
+	}
+	for (n = 0; n < RTE_DIM(context->blocks); n++) {
+		if (context->blocks[n].block_size == 0) {
+			context->blocks[n].block_size = s;
+			*((void **)p) = NULL;
+			context->blocks[n].mem_ptr = p;
+			return;
+		}
+	}
+}
+
+/*
+ * Allocate and initialize a new node.
+ */
+static struct rte_acl_node *
+acl_alloc_node(struct acl_build_context *context, int level)
+{
+	struct rte_acl_node *node;
+
+	if (context->node_free_list != NULL) {
+		node = context->node_free_list;
+		context->node_free_list = node->next;
+		memset(node, 0, sizeof(struct rte_acl_node));
+	} else {
+		node = acl_build_alloc(context, sizeof(struct rte_acl_node), 1);
+	}
+
+	if (node != NULL) {
+		node->num_ptrs = 0;
+		node->level = level;
+		node->node_type = RTE_ACL_NODE_UNDEFINED;
+		node->node_index = RTE_ACL_NODE_UNDEFINED;
+		context->num_nodes++;
+		node->id = context->node_id++;
+	}
+	return (node);
+}
+
+/*
+ * Free a node: dereference all nodes it points to, then put it on the free list
+ */
+static void
+acl_free_node(struct acl_build_context *context,
+	struct rte_acl_node *node)
+{
+	uint32_t n;
+
+	if (node->prev != NULL)
+		node->prev->next = NULL;
+	for (n = 0; n < node->num_ptrs; n++)
+		acl_deref_ptr(context, node, n);
+
+	/* free mrt if this is a match node */
+	if (node->mrt != NULL) {
+		acl_build_free(context, sizeof(struct rte_acl_match_results),
+			node->mrt);
+		node->mrt = NULL;
+	}
+
+	/* free transitions to other nodes */
+	if (node->ptrs != NULL) {
+		acl_build_free(context,
+			node->max_ptrs * sizeof(struct rte_acl_ptr_set),
+			node->ptrs);
+		node->ptrs = NULL;
+	}
+
+	/* put it on the free list */
+	context->num_nodes--;
+	node->next = context->node_free_list;
+	context->node_free_list = node;
+}
+
+
+/*
+ * Include src bitset in dst bitset
+ */
+static void
+acl_include(struct rte_acl_bitset *dst, struct rte_acl_bitset *src, bits_t mask)
+{
+	uint32_t n;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		dst->bits[n] = (dst->bits[n] & mask) | src->bits[n];
+}
+
+/*
+ * Set dst to bits of src1 that are not in src2
+ */
+static int
+acl_exclude(struct rte_acl_bitset *dst,
+	struct rte_acl_bitset *src1,
+	struct rte_acl_bitset *src2)
+{
+	uint32_t n;
+	bits_t all_bits = 0;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+		dst->bits[n] = src1->bits[n] & ~src2->bits[n];
+		all_bits |= dst->bits[n];
+	}
+	return (all_bits != 0);
+}
+
+/*
+ * Add a pointer (ptr) to a node.
+ */
+static int
+acl_add_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node,
+	struct rte_acl_node *ptr,
+	struct rte_acl_bitset *bits)
+{
+	uint32_t n, num_ptrs;
+	struct rte_acl_ptr_set *ptrs = NULL;
+
+	/*
+	 * If there's already a pointer to the same node, just add to the bitset
+	 */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL) {
+			if (node->ptrs[n].ptr == ptr) {
+				acl_include(&node->ptrs[n].values, bits, -1);
+				acl_include(&node->values, bits, -1);
+				return (0);
+			}
+		}
+	}
+
+	/* if there's no room for another pointer, make room */
+	if (node->num_ptrs >= node->max_ptrs) {
+		/* add room for more pointers */
+		num_ptrs = node->max_ptrs + ACL_PTR_ALLOC;
+		if ((ptrs = acl_build_alloc(context, num_ptrs,
+				sizeof(*ptrs))) == NULL)
+			return (-ENOMEM);
+
+		/* copy current pointers to new memory allocation */
+		if (node->ptrs != NULL) {
+			memcpy(ptrs, node->ptrs,
+				node->num_ptrs * sizeof(*ptrs));
+			acl_build_free(context, node->max_ptrs * sizeof(*ptrs),
+				node->ptrs);
+		}
+		node->ptrs = ptrs;
+		node->max_ptrs = num_ptrs;
+	}
+
+	/* Find available ptr and add a new pointer to this node */
+	for (n = node->min_add; n < node->max_ptrs; n++) {
+		if (node->ptrs[n].ptr == NULL) {
+			node->ptrs[n].ptr = ptr;
+			acl_include(&node->ptrs[n].values, bits, 0);
+			acl_include(&node->values, bits, -1);
+			if (ptr != NULL)
+				ptr->ref_count++;
+			if (node->num_ptrs <= n)
+				node->num_ptrs = n + 1;
+			return (0);
+		}
+	}
+
+	return (0);
+}
+
+/*
+ * Add a pointer for a range of values
+ */
+static int
+acl_add_ptr_range(struct acl_build_context *context,
+	struct rte_acl_node *root,
+	struct rte_acl_node *node,
+	uint8_t low,
+	uint8_t high)
+{
+	uint32_t n;
+	struct rte_acl_bitset bitset;
+
+	/* clear the bitset values */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		bitset.bits[n] = 0;
+
+	/* for each bit in range, add bit to set */
+	for (n = 0; n < UINT8_MAX + 1; n++)
+		if (n >= low && n <= high)
+			bitset.bits[n / (sizeof(bits_t) * 8)] |=
+				1 << (n % (sizeof(bits_t) * 8));
+
+	return (acl_add_ptr(context, root, node, &bitset));
+}
+
+/*
+ * Generate a bitset from a byte value and mask.
+ */
+static int
+acl_gen_mask(struct rte_acl_bitset *bitset, uint32_t value, uint32_t mask)
+{
+	int range = 0;
+	uint32_t n;
+
+	/* clear the bitset values */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		bitset->bits[n] = 0;
+
+	/* for each bit in value/mask, add bit to set */
+	for (n = 0; n < UINT8_MAX + 1; n++) {
+		if ((n & mask) == value) {
+			range++;
+			bitset->bits[n / (sizeof(bits_t) * 8)] |=
+				1 << (n % (sizeof(bits_t) * 8));
+		}
+	}
+	return (range);
+}
+
+/*
+ * Determine how A and B intersect.
+ * Determine if A and/or B are supersets of the intersection.
+ */
+static int
+acl_intersect_type(struct rte_acl_bitset *a_bits,
+	struct rte_acl_bitset *b_bits,
+	struct rte_acl_bitset *intersect)
+{
+	uint32_t n;
+	bits_t intersect_bits = 0;
+	bits_t a_superset = 0;
+	bits_t b_superset = 0;
+
+	/*
+	 * calculate and store intersection and check if A and/or B have
+	 * bits outside the intersection (superset)
+	 */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+		intersect->bits[n] = a_bits->bits[n] & b_bits->bits[n];
+		a_superset |= a_bits->bits[n] ^ intersect->bits[n];
+		b_superset |= b_bits->bits[n] ^ intersect->bits[n];
+		intersect_bits |= intersect->bits[n];
+	}
+
+	n = (intersect_bits == 0 ? ACL_INTERSECT_NONE : ACL_INTERSECT) |
+		(b_superset == 0 ? 0 : ACL_INTERSECT_B) |
+		(a_superset == 0 ? 0 : ACL_INTERSECT_A);
+
+	return (n);
+}
+
+/*
+ * Check if all bits in the bitset are on
+ */
+static int
+acl_full(struct rte_acl_node *node)
+{
+	uint32_t n;
+	bits_t all_bits = -1;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		all_bits &= node->values.bits[n];
+	return (all_bits == -1);
+}
+
+/*
+ * Check if all bits in the bitset are off
+ */
+static int
+acl_empty(struct rte_acl_node *node)
+{
+	uint32_t n;
+
+	if (node->ref_count == 0) {
+		for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+			if (0 != node->values.bits[n])
+				return 0;
+		}
+		return (1);
+	} else {
+		return (0);
+	}
+}
+
+/*
+ * Compute the intersection of A and B;
+ * return 1 if there is an intersection, else 0.
+ */
+static int
+acl_intersect(struct rte_acl_bitset *a_bits,
+	struct rte_acl_bitset *b_bits,
+	struct rte_acl_bitset *intersect)
+{
+	uint32_t n;
+	bits_t all_bits = 0;
+
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++) {
+		intersect->bits[n] = a_bits->bits[n] & b_bits->bits[n];
+		all_bits |= intersect->bits[n];
+	}
+	return (all_bits != 0);
+}
+
+/*
+ * Duplicate a node
+ */
+static struct rte_acl_node *
+acl_dup_node(struct acl_build_context *context, struct rte_acl_node *node)
+{
+	uint32_t n;
+	struct rte_acl_node *next;
+
+	if ((next = acl_alloc_node(context, node->level)) == NULL)
+		return (NULL);
+
+	/* allocate the pointers */
+	if (node->num_ptrs > 0) {
+		next->ptrs = acl_build_alloc(context,
+			node->max_ptrs,
+			sizeof(struct rte_acl_ptr_set));
+		if (next->ptrs == NULL)
+			return (NULL);
+		next->max_ptrs = node->max_ptrs;
+	}
+
+	/* copy over the pointers */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL) {
+			next->ptrs[n].ptr = node->ptrs[n].ptr;
+			next->ptrs[n].ptr->ref_count++;
+			acl_include(&next->ptrs[n].values,
+				&node->ptrs[n].values, -1);
+		}
+	}
+
+	next->num_ptrs = node->num_ptrs;
+
+	/* copy over node's match results */
+	if (node->match_flag == 0)
+		next->match_flag = 0;
+	else {
+		next->match_flag = -1;
+		next->mrt = acl_build_alloc(context, 1, sizeof(*next->mrt));
+		memcpy(next->mrt, node->mrt, sizeof(*next->mrt));
+	}
+
+	/* copy over node's bitset */
+	acl_include(&next->values, &node->values, -1);
+
+	node->next = next;
+	next->prev = node;
+
+	return (next);
+}
+
+/*
+ * Dereference a pointer from a node
+ */
+static void
+acl_deref_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node, int index)
+{
+	struct rte_acl_node *ref_node;
+
+	/* De-reference the node at the specified pointer */
+	if (node != NULL && node->ptrs[index].ptr != NULL) {
+		ref_node = node->ptrs[index].ptr;
+		ref_node->ref_count--;
+		if (ref_node->ref_count == 0)
+			acl_free_node(context, ref_node);
+	}
+}
+
+/*
+ * Exclude bitset from a node pointer;
+ * returns 0 if the pointer was deref'd,
+ *         1 otherwise.
+ */
+static int
+acl_exclude_ptr(struct acl_build_context *context,
+	struct rte_acl_node *node,
+	int index,
+	struct rte_acl_bitset *b_bits)
+{
+	int retval = 1;
+
+	/*
+	 * remove bitset from node pointer and deref
+	 * if the bitset becomes empty.
+	 */
+	if (!acl_exclude(&node->ptrs[index].values,
+			&node->ptrs[index].values,
+			b_bits)) {
+		acl_deref_ptr(context, node, index);
+		node->ptrs[index].ptr = NULL;
+		retval = 0;
+	}
+
+	/* exclude bits from the composite bits for the node */
+	acl_exclude(&node->values, &node->values, b_bits);
+	return retval;
+}
+
+/*
+ * Remove a bitset from src ptr and move remaining ptr to dst
+ */
+static int
+acl_move_ptr(struct acl_build_context *context,
+	struct rte_acl_node *dst,
+	struct rte_acl_node *src,
+	int index,
+	struct rte_acl_bitset *b_bits)
+{
+	int rc;
+
+	if (b_bits != NULL)
+		if (!acl_exclude_ptr(context, src, index, b_bits))
+			return (0);
+
+	/* add src pointer to dst node */
+	if ((rc = acl_add_ptr(context, dst, src->ptrs[index].ptr,
+			&src->ptrs[index].values)) < 0)
+		return (rc);
+
+	/* remove ptr from src */
+	acl_exclude_ptr(context, src, index, &src->ptrs[index].values);
+	return (1);
+}
+
+/*
+ * Exclude b_bits from the src pointer and copy the remaining pointer to dst
+ */
+static int
+acl_copy_ptr(struct acl_build_context *context,
+	struct rte_acl_node *dst,
+	struct rte_acl_node *src,
+	int index,
+	struct rte_acl_bitset *b_bits)
+{
+	int rc;
+	struct rte_acl_bitset bits;
+
+	if (b_bits != NULL)
+		if (!acl_exclude(&bits, &src->ptrs[index].values, b_bits))
+			return (0);
+
+	if ((rc = acl_add_ptr(context, dst, src->ptrs[index].ptr, &bits)) < 0)
+		return (rc);
+	return (1);
+}
+
+/*
+ * Fill in gaps in ptrs list with the ptr at the end of the list
+ */
+static void
+acl_compact_node_ptrs(struct rte_acl_node *node_a)
+{
+	uint32_t n;
+	int min_add = node_a->min_add;
+
+	while (node_a->num_ptrs > 0  &&
+			node_a->ptrs[node_a->num_ptrs - 1].ptr == NULL)
+		node_a->num_ptrs--;
+
+	for (n = min_add; n + 1 < node_a->num_ptrs; n++) {
+
+		/* if this entry is empty */
+		if (node_a->ptrs[n].ptr == NULL) {
+
+			/* move the last pointer to this entry */
+			acl_include(&node_a->ptrs[n].values,
+				&node_a->ptrs[node_a->num_ptrs - 1].values,
+				0);
+			node_a->ptrs[n].ptr =
+				node_a->ptrs[node_a->num_ptrs - 1].ptr;
+
+			/*
+			 * mark the end as empty and adjust the number
+			 * of used pointer entries
+			 */
+			node_a->ptrs[node_a->num_ptrs - 1].ptr = NULL;
+			while (node_a->num_ptrs > 0  &&
+				node_a->ptrs[node_a->num_ptrs - 1].ptr == NULL)
+				node_a->num_ptrs--;
+		}
+	}
+}
+
+/*
+ * acl_merge helper routine.
+ */
+static int
+acl_merge_intersect(struct acl_build_context *context,
+	struct rte_acl_node *node_a, uint32_t idx_a,
+	struct rte_acl_node *node_b, uint32_t idx_b,
+	int next_move, int level,
+	struct rte_acl_bitset *intersect_ptr)
+{
+	struct rte_acl_node *node_c;
+
+	/* Duplicate A for intersection */
+	if ((node_c = acl_dup_node(context, node_a->ptrs[idx_a].ptr)) == NULL)
+		return (-1);
+
+	/* Remove intersection from A */
+	acl_exclude_ptr(context, node_a, idx_a, intersect_ptr);
+
+	/*
+	 * Add a link from A to C for all transitions
+	 * in the intersection
+	 */
+	if (acl_add_ptr(context, node_a, node_c, intersect_ptr) < 0)
+		return (-1);
+
+	/* merge B->node into C */
+	return (acl_merge(context, node_c, node_b->ptrs[idx_b].ptr, next_move,
+		0, level + 1));
+}
+
+/*
+ * Merge the children of nodes A and B together.
+ *
+ * if match node
+ *	For each category
+ *		node A result = highest priority result
+ * if any pointers in A intersect with any in B
+ *	For each intersection
+ *		C = copy of node that A points to
+ *		remove intersection from A pointer
+ *		add a pointer to A that points to C for the intersection
+ *		Merge C and node that B points to
+ * Compact the pointers in A and B
+ * if move flag
+ *	If B has only one reference
+ *		Move B pointers to A
+ *	else
+ *		Copy B pointers to A
+ */
+static int
+acl_merge(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	int move, int a_subset, int level)
+{
+	uint32_t n, m, ptrs_a, ptrs_b;
+	uint32_t min_add_a, min_add_b;
+	int intersect_type;
+	int node_intersect_type;
+	int b_full, next_move, rc;
+	struct rte_acl_bitset intersect_values;
+	struct rte_acl_bitset intersect_ptr;
+
+	min_add_a = 0;
+	min_add_b = 0;
+	intersect_type = 0;
+	node_intersect_type = 0;
+
+	if (level == 0)
+		a_subset = 1;
+
+	/*
+	 *  Resolve match priorities
+	 */
+	if (node_a->match_flag != 0 || node_b->match_flag != 0) {
+
+		if (node_a->match_flag == 0 || node_b->match_flag == 0)
+			RTE_LOG(ERR, ACL, "Not both matches\n");
+
+		if (node_b->match_flag < node_a->match_flag)
+			RTE_LOG(ERR, ACL, "Not same match\n");
+
+		for (n = 0; n < context->cfg.num_categories; n++) {
+			if (node_a->mrt->priority[n] <
+					node_b->mrt->priority[n]) {
+				node_a->mrt->priority[n] =
+					node_b->mrt->priority[n];
+				node_a->mrt->results[n] =
+					node_b->mrt->results[n];
+			}
+		}
+	}
+
+	/*
+	 * If the two node transitions intersect then merge the transitions.
+	 * Check intersection for entire node (all pointers)
+	 */
+	node_intersect_type = acl_intersect_type(&node_a->values,
+		&node_b->values,
+		&intersect_values);
+
+	if (node_intersect_type & ACL_INTERSECT) {
+
+		b_full = acl_full(node_b);
+
+		min_add_b = node_b->min_add;
+		node_b->min_add = node_b->num_ptrs;
+		ptrs_b = node_b->num_ptrs;
+
+		min_add_a = node_a->min_add;
+		node_a->min_add = node_a->num_ptrs;
+		ptrs_a = node_a->num_ptrs;
+
+		for (n = 0; n < ptrs_a; n++) {
+			for (m = 0; m < ptrs_b; m++) {
+
+				if (node_a->ptrs[n].ptr == NULL ||
+						node_b->ptrs[m].ptr == NULL ||
+						node_a->ptrs[n].ptr ==
+						node_b->ptrs[m].ptr)
+						continue;
+
+				intersect_type = acl_intersect_type(
+					&node_a->ptrs[n].values,
+					&node_b->ptrs[m].values,
+					&intersect_ptr);
+
+				/* If this node is not a 'match' node */
+				if ((intersect_type & ACL_INTERSECT) &&
+					(context->cfg.num_categories != 1 ||
+					!(node_a->ptrs[n].ptr->match_flag))) {
+
+					/*
+					 * The next merge is a 'move' if this
+					 * one is and B's transitions are a
+					 * subset of the intersection.
+					 */
+					next_move = move &&
+						(intersect_type &
+						ACL_INTERSECT_B) == 0;
+
+					if (a_subset && b_full) {
+						rc = acl_merge(context,
+							node_a->ptrs[n].ptr,
+							node_b->ptrs[m].ptr,
+							next_move,
+							1, level + 1);
+						if (rc != 0)
+							return (rc);
+					} else {
+						rc = acl_merge_intersect(
+							context, node_a, n,
+							node_b, m, next_move,
+							level, &intersect_ptr);
+						if (rc != 0)
+							return (rc);
+					}
+				}
+			}
+		}
+	}
+
+	/* Compact pointers */
+	node_a->min_add = min_add_a;
+	acl_compact_node_ptrs(node_a);
+	node_b->min_add = min_add_b;
+	acl_compact_node_ptrs(node_b);
+
+	/*
+	 *  Either COPY or MOVE pointers from B to A
+	 */
+	acl_intersect(&node_a->values, &node_b->values, &intersect_values);
+
+	if (move && node_b->ref_count == 1) {
+		for (m = 0; m < node_b->num_ptrs; m++) {
+			if (node_b->ptrs[m].ptr != NULL &&
+					acl_move_ptr(context, node_a, node_b, m,
+					&intersect_values) < 0)
+				return (-1);
+		}
+	} else {
+		for (m = 0; m < node_b->num_ptrs; m++) {
+			if (node_b->ptrs[m].ptr != NULL &&
+					acl_copy_ptr(context, node_a, node_b, m,
+					&intersect_values) < 0)
+				return (-1);
+		}
+	}
+
+	/*
+	 *  Free node B if it is empty (no longer used)
+	 */
+	if (acl_empty(node_b)) {
+		acl_free_node(context, node_b);
+	}
+	return (0);
+}
+
+static int
+acl_resolve_leaf(struct acl_build_context *context,
+	struct rte_acl_node *node_a,
+	struct rte_acl_node *node_b,
+	struct rte_acl_node **node_c)
+{
+	uint32_t n;
+	int combined_priority = ACL_PRIORITY_EQUAL;
+
+	for (n = 0; n < context->cfg.num_categories; n++) {
+		if (node_a->mrt->priority[n] != node_b->mrt->priority[n]) {
+			combined_priority |= (node_a->mrt->priority[n] >
+				node_b->mrt->priority[n]) ?
+				ACL_PRIORITY_NODE_A : ACL_PRIORITY_NODE_B;
+		}
+	}
+
+	/*
+	 * if node a is higher or equal priority for all categories,
+	 * then return node_a.
+	 */
+	if (combined_priority == ACL_PRIORITY_NODE_A ||
+			combined_priority == ACL_PRIORITY_EQUAL) {
+		*node_c = node_a;
+		return 0;
+	}
+
+	/*
+	 * if node b is higher or equal priority for all categories,
+	 * then return node_b.
+	 */
+	if (combined_priority == ACL_PRIORITY_NODE_B) {
+		*node_c = node_b;
+		return 0;
+	}
+
+	/*
+	 * mixed priorities - create a new node with the highest priority
+	 * for each category.
+	 */
+
+	/* force new duplication. */
+	node_a->next = NULL;
+
+	*node_c = acl_dup_node(context, node_a);
+	for (n = 0; n < context->cfg.num_categories; n++) {
+		if ((*node_c)->mrt->priority[n] < node_b->mrt->priority[n]) {
+			(*node_c)->mrt->priority[n] = node_b->mrt->priority[n];
+			(*node_c)->mrt->results[n] = node_b->mrt->results[n];
+		}
+	}
+	return 0;
+}
+
+/*
+ * Within the existing trie structure, determine which nodes are
+ * part of the subtree of the trie to be merged.
+ *
+ * For these purposes, a subtree is defined as the set of nodes that
+ * are 1) not a superset of the intersection with the same level of
+ * the merging tree, and 2) do not have any references from a node
+ * outside of the subtree.
+ */
+static void
+mark_subtree(struct rte_acl_node *node,
+	struct rte_acl_bitset *level_bits,
+	uint32_t level,
+	uint32_t id)
+{
+	uint32_t n;
+
+	/* mark this node as part of the subtree */
+	node->subtree_id = id | RTE_ACL_SUBTREE_NODE;
+
+	for (n = 0; n < node->num_ptrs; n++) {
+
+		if (node->ptrs[n].ptr != NULL) {
+
+			struct rte_acl_bitset intersect_bits;
+			int intersect;
+
+			/*
+			 * Item 1:
+			 * check if this child pointer is not a superset of the
+			 * same level of the merging tree.
+			 */
+			intersect = acl_intersect_type(&node->ptrs[n].values,
+				&level_bits[level],
+				&intersect_bits);
+
+			if ((intersect & ACL_INTERSECT_A) == 0) {
+
+				struct rte_acl_node *child = node->ptrs[n].ptr;
+
+				/*
+				 * reset subtree reference if this is
+				 * the first visit by this subtree.
+				 */
+				if (child->subtree_id != id) {
+					child->subtree_id = id;
+					child->subtree_ref_count = 0;
+				}
+
+				/*
+				 * Item 2:
+				 * increment the subtree reference count and if
+				 * all references are from this subtree then
+				 * recurse to that child
+				 */
+				child->subtree_ref_count++;
+				if (child->subtree_ref_count ==
+						child->ref_count)
+					mark_subtree(child, level_bits,
+						level + 1, id);
+			}
+		}
+	}
+}
+
+/*
+ * Build the set of bits that define the set of transitions
+ * for each level of a trie.
+ */
+static void
+build_subset_mask(struct rte_acl_node *node,
+	struct rte_acl_bitset *level_bits,
+	int level)
+{
+	uint32_t n;
+
+	/* Intersect this node's transitions with the set for this level */
+	for (n = 0; n < RTE_ACL_BIT_SET_SIZE; n++)
+		level_bits[level].bits[n] &= node->values.bits[n];
+
+	/* For each child, add the transitions for the next level */
+	for (n = 0; n < node->num_ptrs; n++)
+		if (node->ptrs[n].ptr != NULL)
+			build_subset_mask(node->ptrs[n].ptr, level_bits,
+				level + 1);
+}
+
+/*
+ * Merge nodes A and B together,
+ *   returns a node that is the path for the intersection
+ *
+ * If match node (leaf on trie)
+ *	For each category
+ *		return node = highest priority result
+ *
+ * Create C as a duplicate of A to point to child intersections
+ * If any pointers in C intersect with any in B
+ *	For each intersection
+ *		merge children
+ *		remove intersection from C pointer
+ *		add a pointer from C to child intersection node
+ * Compact the pointers in A and B
+ * Copy any B pointers that are outside of the intersection to C
+ * If C has no references to the B trie
+ *   free C and return A
+ * Else If C has no references to the A trie
+ *   free C and return B
+ * Else
+ *   return C
+ */
+static int
+acl_merge_trie(struct acl_build_context *context,
+	struct rte_acl_node *node_a, struct rte_acl_node *node_b,
+	uint32_t level, uint32_t subtree_id, struct rte_acl_node **return_c)
+{
+	uint32_t n, m, ptrs_c, ptrs_b;
+	uint32_t min_add_c, min_add_b;
+	int node_intersect_type;
+	struct rte_acl_bitset node_intersect;
+	struct rte_acl_node *node_c;
+	struct rte_acl_node *node_a_next;
+	int node_b_refs;
+	int node_a_refs;
+
+	node_c = node_a;
+	node_a_next = node_a->next;
+	min_add_c = 0;
+	min_add_b = 0;
+	node_a_refs = node_a->num_ptrs;
+	node_b_refs = 0;
+	node_intersect_type = 0;
+
+	/* Resolve leaf nodes (matches) */
+	if (node_a->match_flag != 0) {
+		acl_resolve_leaf(context, node_a, node_b, return_c);
+		return 0;
+	}
+
+	/*
+	 * Create node C as a copy of node A if node A is not part of
+	 * a subtree of the merging tree (node B side). Otherwise,
+	 * just use node A.
+	 */
+	if (level > 0 &&
+			node_a->subtree_id !=
+			(subtree_id | RTE_ACL_SUBTREE_NODE)) {
+		node_c = acl_dup_node(context, node_a);
+		node_c->subtree_id = subtree_id | RTE_ACL_SUBTREE_NODE;
+	}
+
+	/*
+	 * If the two node transitions intersect then merge the transitions.
+	 * Check intersection for entire node (all pointers)
+	 */
+	node_intersect_type = acl_intersect_type(&node_c->values,
+		&node_b->values,
+		&node_intersect);
+
+	if (node_intersect_type & ACL_INTERSECT) {
+
+		min_add_b = node_b->min_add;
+		node_b->min_add = node_b->num_ptrs;
+		ptrs_b = node_b->num_ptrs;
+
+		min_add_c = node_c->min_add;
+		node_c->min_add = node_c->num_ptrs;
+		ptrs_c = node_c->num_ptrs;
+
+		for (n = 0; n < ptrs_c; n++) {
+			if (node_c->ptrs[n].ptr == NULL) {
+				node_a_refs--;
+				continue;
+			}
+			node_c->ptrs[n].ptr->next = NULL;
+			for (m = 0; m < ptrs_b; m++) {
+
+				struct rte_acl_bitset child_intersect;
+				int child_intersect_type;
+				struct rte_acl_node *child_node_c = NULL;
+
+				if (node_b->ptrs[m].ptr == NULL ||
+						node_c->ptrs[n].ptr ==
+						node_b->ptrs[m].ptr)
+						continue;
+
+				child_intersect_type = acl_intersect_type(
+					&node_c->ptrs[n].values,
+					&node_b->ptrs[m].values,
+					&child_intersect);
+
+				if ((child_intersect_type & ACL_INTERSECT) !=
+						0) {
+					if (acl_merge_trie(context,
+							node_c->ptrs[n].ptr,
+							node_b->ptrs[m].ptr,
+							level + 1, subtree_id,
+							&child_node_c))
+						return 1;
+
+					if (child_node_c != NULL &&
+							child_node_c !=
+							node_c->ptrs[n].ptr) {
+
+						node_b_refs++;
+
+						/*
+						 * Add a link from C to
+						 * child_C for all transitions
+						 * in the intersection.
+						 */
+						acl_add_ptr(context, node_c,
+							child_node_c,
+							&child_intersect);
+
+						/*
+						 * increment refs if the
+						 * pointer is not to node B.
+						 */
+						node_a_refs += (child_node_c !=
+							node_b->ptrs[m].ptr);
+
+						/*
+						 * Remove intersection from C
+						 * pointer.
+						 */
+						if (!acl_exclude(
+							&node_c->ptrs[n].values,
+							&node_c->ptrs[n].values,
+							&child_intersect)) {
+							acl_deref_ptr(context,
+								node_c, n);
+							node_c->ptrs[n].ptr =
+								NULL;
+							node_a_refs--;
+						}
+					}
+				}
+			}
+		}
+
+		/* Compact pointers */
+		node_c->min_add = min_add_c;
+		acl_compact_node_ptrs(node_c);
+		node_b->min_add = min_add_b;
+		acl_compact_node_ptrs(node_b);
+	}
+
+	/*
+	 *  Copy pointers outside of the intersection from B to C
+	 */
+	if ((node_intersect_type & ACL_INTERSECT_B) != 0) {
+		node_b_refs++;
+		for (m = 0; m < node_b->num_ptrs; m++)
+			if (node_b->ptrs[m].ptr != NULL)
+				acl_copy_ptr(context, node_c,
+					node_b, m, &node_intersect);
+	}
+
+	/*
+	 * Free node C if top of trie is contained in A or B
+	 *  if node C is a duplicate of node A &&
+	 *     node C was not an existing duplicate
+	 */
+	if (node_c != node_a && node_c != node_a_next) {
+
+		/*
+		 * if the intersection has no references to the
+		 * B side, then it is contained in A
+		 */
+		if (node_b_refs == 0) {
+			acl_free_node(context, node_c);
+			node_c = node_a;
+		} else {
+			/*
+			 * if the intersection has no references to the
+			 * A side, then it is contained in B.
+			 */
+			if (node_a_refs == 0) {
+				acl_free_node(context, node_c);
+				node_c = node_b;
+			}
+		}
+	}
+
+	if (return_c != NULL)
+		*return_c = node_c;
+
+	if (level == 0)
+		acl_free_node(context, node_b);
+
+	return 0;
+}
+
+/*
+ * Reset current runtime fields before next build:
+ *  - free allocated RT memory.
+ *  - reset all RT related fields to zero.
+ */
+static void
+acl_build_reset(struct rte_acl_ctx *ctx)
+{
+	rte_free(ctx->mem);
+	memset(&ctx->num_categories, 0,
+		sizeof(*ctx) - offsetof(struct rte_acl_ctx, num_categories));
+}
+
+static void
+acl_gen_range(struct acl_build_context *context,
+	const uint8_t *hi, const uint8_t *lo, int size, int level,
+	struct rte_acl_node *root, struct rte_acl_node *end)
+{
+	struct rte_acl_node *node, *prev;
+	uint32_t n;
+
+	prev = root;
+	for (n = size - 1; n > 0; n--) {
+		node = acl_alloc_node(context, level++);
+		acl_add_ptr_range(context, prev, node, lo[n], hi[n]);
+		prev = node;
+	}
+	acl_add_ptr_range(context, prev, end, lo[0], hi[0]);
+}
+
+static struct rte_acl_node *
+acl_gen_range_trie(struct acl_build_context *context,
+	const void *min, const void *max,
+	int size, int level, struct rte_acl_node **pend)
+{
+	int32_t n;
+	struct rte_acl_node *root;
+	const uint8_t *lo = (const uint8_t *)min;
+	const uint8_t *hi = (const uint8_t *)max;
+
+	*pend = acl_alloc_node(context, level+size);
+	root = acl_alloc_node(context, level++);
+
+	if (lo[size - 1] == hi[size - 1]) {
+		acl_gen_range(context, hi, lo, size, level, root, *pend);
+	} else {
+		uint8_t limit_lo[64];
+		uint8_t limit_hi[64];
+		uint8_t hi_ff = UINT8_MAX;
+		uint8_t lo_00 = 0;
+
+		memset(limit_lo, 0, RTE_DIM(limit_lo));
+		memset(limit_hi, UINT8_MAX, RTE_DIM(limit_hi));
+
+		for (n = size - 2; n >= 0; n--) {
+			hi_ff = (uint8_t)(hi_ff & hi[n]);
+			lo_00 = (uint8_t)(lo_00 | lo[n]);
+		}
+
+		if (hi_ff != UINT8_MAX) {
+			limit_lo[size - 1] = hi[size - 1];
+			acl_gen_range(context, hi, limit_lo, size, level,
+				root, *pend);
+		}
+
+		if (lo_00 != 0) {
+			limit_hi[size - 1] = lo[size - 1];
+			acl_gen_range(context, limit_hi, lo, size, level,
+				root, *pend);
+		}
+
+		if (hi[size - 1] - lo[size - 1] > 1 ||
+				lo_00 == 0 ||
+				hi_ff == UINT8_MAX) {
+			limit_lo[size-1] = (uint8_t)(lo[size-1] + (lo_00 != 0));
+			limit_hi[size-1] = (uint8_t)(hi[size-1] -
+				(hi_ff != UINT8_MAX));
+			acl_gen_range(context, limit_hi, limit_lo, size,
+				level, root, *pend);
+		}
+	}
+	return (root);
+}
+
+static struct rte_acl_node *
+acl_gen_mask_trie(struct acl_build_context *context,
+	const void *value, const void *mask,
+	int size, int level, struct rte_acl_node **pend)
+{
+	int32_t n;
+	struct rte_acl_node *root;
+	struct rte_acl_node *node, *prev;
+	struct rte_acl_bitset bits;
+	const uint8_t *val = (const uint8_t *)value;
+	const uint8_t *msk = (const uint8_t *)mask;
+
+	root = acl_alloc_node(context, level++);
+	prev = root;
+
+	for (n = size - 1; n >= 0; n--) {
+		node = acl_alloc_node(context, level++);
+		acl_gen_mask(&bits, val[n] & msk[n], msk[n]);
+		acl_add_ptr(context, prev, node, &bits);
+		prev = node;
+	}
+
+	*pend = prev;
+	return (root);
+}
+
+static struct rte_acl_node *
+build_trie(struct acl_build_context *context, struct rte_acl_build_rule *head,
+	struct rte_acl_build_rule **last, uint32_t *count)
+{
+	uint32_t n, m;
+	int field_index, node_count;
+	struct rte_acl_node *trie;
+	struct rte_acl_build_rule *prev, *rule;
+	struct rte_acl_node *end, *merge, *root, *end_prev;
+	const struct rte_acl_field *fld;
+	struct rte_acl_bitset level_bits[RTE_ACL_MAX_LEVELS];
+
+	prev = head;
+	rule = head;
+
+	if ((trie = acl_alloc_node(context, 0)) == NULL)
+		return (NULL);
+
+	while (rule != NULL) {
+
+		if ((root = acl_alloc_node(context, 0)) == NULL)
+			return (NULL);
+
+		root->ref_count = 1;
+		end = root;
+
+		for (n = 0; n < rule->config->num_fields; n++) {
+
+			field_index = rule->config->defs[n].field_index;
+			fld = rule->f->field + field_index;
+			end_prev = end;
+
+			/* build a mini-trie for this field */
+			switch (rule->config->defs[n].type) {
+
+			case RTE_ACL_FIELD_TYPE_BITMASK:
+				merge = acl_gen_mask_trie(context,
+					&fld->value,
+					&fld->mask_range,
+					rule->config->defs[n].size,
+					end->level + 1,
+					&end);
+				break;
+
+			case RTE_ACL_FIELD_TYPE_MASK:
+			{
+				/*
+				 * set msb for the size of the field and
+				 * all higher bits.
+				 */
+				uint64_t mask;
+
+				if (fld->mask_range.u32 == 0) {
+					mask = 0;
+
+				/*
+				 * shift left to set the most significant
+				 * mask_range.u32 bits of the field
+				 * (and all higher bits).
+				 */
+				} else {
+					mask = -1 <<
+						(rule->config->defs[n].size *
+						CHAR_BIT - fld->mask_range.u32);
+				}
+
+				/* gen a mini-trie for this field */
+				merge = acl_gen_mask_trie(context,
+					&fld->value,
+					(char *)&mask,
+					rule->config->defs[n].size,
+					end->level + 1,
+					&end);
+			}
+			break;
+
+			case RTE_ACL_FIELD_TYPE_RANGE:
+				merge = acl_gen_range_trie(context,
+					&rule->f->field[field_index].value,
+					&rule->f->field[field_index].mask_range,
+					rule->config->defs[n].size,
+					end->level + 1,
+					&end);
+				break;
+
+			default:
+				RTE_LOG(ERR, ACL,
+					"Error in rule[%u] type - %hhu\n",
+					rule->f->data.userdata,
+					rule->config->defs[n].type);
+				return (NULL);
+			}
+
+			/* merge this field on to the end of the rule */
+			if (acl_merge_trie(context, end_prev, merge, 0,
+					0, NULL) != 0) {
+				return (NULL);
+			}
+		}
+
+		end->match_flag = ++context->num_build_rules;
+
+		/*
+		 * Setup the results for this rule.
+		 * The result and priority of each category.
+		 */
+		if (end->mrt == NULL &&
+				(end->mrt = acl_build_alloc(context, 1,
+				sizeof(*end->mrt))) == NULL)
+			return (NULL);
+
+		for (m = 0; m < context->cfg.num_categories; m++) {
+			if (rule->f->data.category_mask & (1 << m)) {
+				end->mrt->results[m] = rule->f->data.userdata;
+				end->mrt->priority[m] = rule->f->data.priority;
+			} else {
+				end->mrt->results[m] = 0;
+				end->mrt->priority[m] = 0;
+			}
+		}
+
+		node_count = context->num_nodes;
+
+		memset(&level_bits[0], UINT8_MAX, sizeof(level_bits));
+		build_subset_mask(root, &level_bits[0], 0);
+		mark_subtree(trie, &level_bits[0], 0, end->match_flag);
+		(*count)++;
+
+		/* merge this rule into the trie */
+		if (acl_merge_trie(context, trie, root, 0, end->match_flag,
+			NULL))
+			return NULL;
+
+		node_count = context->num_nodes - node_count;
+		if (node_count > NODE_MAX) {
+			*last = prev;
+			return trie;
+		}
+
+		prev = rule;
+		rule = rule->next;
+	}
+
+	*last = NULL;
+	return trie;
+}
+
+static int
+acl_calc_wildness(struct rte_acl_build_rule *head,
+	const struct rte_acl_config *config)
+{
+	uint32_t n;
+	struct rte_acl_build_rule *rule;
+
+	for (rule = head; rule != NULL; rule = rule->next) {
+
+		for (n = 0; n < config->num_fields; n++) {
+
+			double wild = 0;
+			double size = CHAR_BIT * config->defs[n].size;
+			int field_index = config->defs[n].field_index;
+			const struct rte_acl_field *fld = rule->f->field +
+				field_index;
+
+			switch (rule->config->defs[n].type) {
+			case RTE_ACL_FIELD_TYPE_BITMASK:
+				wild = (size -
+					_mm_popcnt_u32(fld->mask_range.u8)) /
+					size;
+				break;
+
+			case RTE_ACL_FIELD_TYPE_MASK:
+				wild = (size - fld->mask_range.u32) / size;
+				break;
+
+			case RTE_ACL_FIELD_TYPE_RANGE:
+				switch (rule->config->defs[n].size) {
+				case sizeof(uint8_t):
+					wild = ((double)fld->mask_range.u8 -
+						fld->value.u8) / UINT8_MAX;
+					break;
+				case sizeof(uint16_t):
+					wild = ((double)fld->mask_range.u16 -
+						fld->value.u16) / UINT16_MAX;
+					break;
+				case sizeof(uint32_t):
+					wild = ((double)fld->mask_range.u32 -
+						fld->value.u32) / UINT32_MAX;
+					break;
+				case sizeof(uint64_t):
+					wild = ((double)fld->mask_range.u64 -
+						fld->value.u64) / UINT64_MAX;
+					break;
+				default:
+					RTE_LOG(ERR, ACL,
+						"%s(rule: %u) invalid %u-th "
+						"field, type: %hhu, "
+						"unknown size: %hhu\n",
+						__func__,
+						rule->f->data.userdata,
+						n,
+						rule->config->defs[n].type,
+						rule->config->defs[n].size);
+					return (-EINVAL);
+				}
+				break;
+
+			default:
+				RTE_LOG(ERR, ACL,
+					"%s(rule: %u) invalid %u-th "
+					"field, unknown type: %hhu\n",
+					__func__,
+					rule->f->data.userdata,
+					n,
+					rule->config->defs[n].type);
+				return (-EINVAL);
+
+			}
+
+			rule->wildness[field_index] = (uint32_t)(wild * 100);
+		}
+	}
+
+	return (0);
+}
+
+static int
+acl_rule_stats(struct rte_acl_build_rule *head, struct rte_acl_config *config,
+	uint32_t *wild_limit)
+{
+	int min;
+	struct rte_acl_build_rule *rule;
+	uint32_t n, m, fields_deactivated = 0;
+	uint32_t start = 0, deactivate = 0;
+	int tally[RTE_ACL_MAX_LEVELS][TALLY_NUM];
+
+	memset(tally, 0, sizeof(tally));
+
+	for (rule = head; rule != NULL; rule = rule->next) {
+
+		for (n = 0; n < config->num_fields; n++) {
+			uint32_t field_index = config->defs[n].field_index;
+
+			tally[n][TALLY_0]++;
+			for (m = 1; m < RTE_DIM(wild_limits); m++) {
+				if (rule->wildness[field_index] >=
+						wild_limits[m])
+					tally[n][m]++;
+			}
+		}
+
+		for (n = config->num_fields - 1; n > 0; n--) {
+			uint32_t field_index = config->defs[n].field_index;
+
+			if (rule->wildness[field_index] == 100)
+				tally[n][TALLY_DEPTH]++;
+			else
+				break;
+		}
+	}
+
+	/*
+	 * Look for any field that is always wild and drop it from the config
+	 * Only deactivate if all fields for a given input loop are deactivated.
+	 */
+	for (n = 1; n < config->num_fields; n++) {
+		if (config->defs[n].input_index !=
+				config->defs[n - 1].input_index) {
+			for (m = start; m < n; m++)
+				tally[m][TALLY_DEACTIVATED] = deactivate;
+			fields_deactivated += deactivate;
+			start = n;
+			deactivate = 1;
+		}
+
+		/* if the field is not always completely wild */
+		if (tally[n][TALLY_100] != tally[n][TALLY_0])
+			deactivate = 0;
+	}
+
+	for (m = start; m < n; m++)
+		tally[m][TALLY_DEACTIVATED] = deactivate;
+
+	fields_deactivated += deactivate;
+
+	/* remove deactivated fields */
+	if (fields_deactivated) {
+		uint32_t k, l = 0;
+
+		for (k = 0; k < config->num_fields; k++) {
+			if (tally[k][TALLY_DEACTIVATED] == 0) {
+				memcpy(&tally[l][0], &tally[k][0],
+					TALLY_NUM * sizeof(tally[0][0]));
+				memcpy(&config->defs[l++],
+					&config->defs[k],
+					sizeof(struct rte_acl_field_def));
+			}
+		}
+		config->num_fields = l;
+	}
+
+	min = RTE_ACL_SINGLE_TRIE_SIZE;
+	if (config->num_fields == 2)
+		min *= 4;
+	else if (config->num_fields == 3)
+		min *= 3;
+	else if (config->num_fields == 4)
+		min *= 2;
+
+	if (tally[0][TALLY_0] < min)
+		return 0;
+	for (n = 0; n < config->num_fields; n++)
+		wild_limit[n] = 0;
+
+	/*
+	 * If trailing fields are 100% wild, group those together.
+	 * This allows the search length of the trie to be shortened.
+	 */
+	for (n = 1; n < config->num_fields; n++) {
+
+		double rule_percentage = (double)tally[n][TALLY_DEPTH] /
+			tally[n][0];
+
+		if (rule_percentage > RULE_PERCENTAGE) {
+			/* if it crosses an input boundary then round up */
+			while (config->defs[n - 1].input_index ==
+					config->defs[n].input_index)
+				n++;
+
+			/* set the limit for selecting rules */
+			while (n < config->num_fields)
+				wild_limit[n++] = 100;
+
+			if (wild_limit[n - 1] == 100)
+				return 1;
+		}
+	}
+
+	/* look for the most wild field that covers 40% or more of the rules */
+	for (n = 1; n < config->num_fields; n++) {
+		for (m = TALLY_100; m > 0; m--) {
+
+			double rule_percentage = (double)tally[n][m] /
+				tally[n][0];
+
+			if (tally[n][TALLY_DEACTIVATED] == 0 &&
+					tally[n][TALLY_0] >
+					RTE_ACL_SINGLE_TRIE_SIZE &&
+					rule_percentage > NODE_PERCENTAGE &&
+					rule_percentage < 0.80) {
+				wild_limit[n] = wild_limits[m];
+				return 1;
+			}
+		}
+	}
+	return 0;
+}
+
+static int
+order(struct rte_acl_build_rule **insert, struct rte_acl_build_rule *rule)
+{
+	uint32_t n;
+	struct rte_acl_build_rule *left = *insert;
+
+	if (left == NULL)
+		return (0);
+
+	for (n = 1; n < left->config->num_fields; n++) {
+		int field_index = left->config->defs[n].field_index;
+
+		if (left->wildness[field_index] != rule->wildness[field_index])
+			return (left->wildness[field_index] >=
+				rule->wildness[field_index]);
+	}
+	return (0);
+}
+
+static struct rte_acl_build_rule *
+ordered_insert_rule(struct rte_acl_build_rule *head,
+	struct rte_acl_build_rule *rule)
+{
+	struct rte_acl_build_rule **insert;
+
+	if (rule == NULL)
+		return head;
+
+	rule->next = head;
+	if (head == NULL)
+		return rule;
+
+	insert = &head;
+	while (order(insert, rule)) {
+		insert = &(*insert)->next;
+	}
+
+	rule->next = *insert;
+	*insert = rule;
+	return (head);
+}
+
+static struct rte_acl_build_rule *
+sort_rules(struct rte_acl_build_rule *head)
+{
+	struct rte_acl_build_rule *rule, *reordered_head = NULL;
+	struct rte_acl_build_rule *last_rule = NULL;
+
+	for (rule = head; rule != NULL; rule = rule->next) {
+		reordered_head = ordered_insert_rule(reordered_head, last_rule);
+		last_rule = rule;
+	}
+
+	if (last_rule != reordered_head) {
+		reordered_head = ordered_insert_rule(reordered_head, last_rule);
+	}
+
+	return reordered_head;
+}
+
+static uint32_t
+acl_build_index(const struct rte_acl_config *config, uint32_t *data_index)
+{
+	uint32_t n, m;
+	int32_t last_header;
+
+	m = 0;
+	last_header = -1;
+
+	for (n = 0; n < config->num_fields; n++) {
+		if (last_header != config->defs[n].input_index) {
+			last_header = config->defs[n].input_index;
+			data_index[m++] = config->defs[n].offset;
+		}
+	}
+
+	return (m);
+}
+
+static int
+acl_build_tries(struct acl_build_context *context,
+	struct rte_acl_build_rule *head)
+{
+	int32_t rc;
+	uint32_t n, m, num_tries;
+	struct rte_acl_config *config;
+	struct rte_acl_build_rule *last, *rule;
+	uint32_t wild_limit[RTE_ACL_MAX_LEVELS];
+	struct rte_acl_build_rule *rule_sets[RTE_ACL_MAX_TRIES];
+
+	config = head->config;
+	rule = head;
+	rule_sets[0] = head;
+	num_tries = 1;
+
+	/* initialize tries */
+	for (n = 0; n < RTE_DIM(context->tries); n++) {
+		context->tries[n].type = RTE_ACL_UNUSED_TRIE;
+		context->bld_tries[n].trie = NULL;
+		context->tries[n].count = 0;
+		context->tries[n].smallest = INT32_MAX;
+	}
+
+	context->tries[0].type = RTE_ACL_FULL_TRIE;
+
+	/* calc wildness of each field of each rule */
+	if ((rc = acl_calc_wildness(head, config)) != 0)
+		return (rc);
+
+	n = acl_rule_stats(head, config, &wild_limit[0]);
+
+	/* put all rules that fit the wildness criteria into a separate trie */
+	while (n > 0 && num_tries < RTE_ACL_MAX_TRIES) {
+
+		struct rte_acl_config *new_config;
+		struct rte_acl_build_rule **prev = &rule_sets[num_tries - 1];
+		struct rte_acl_build_rule *next = head->next;
+
+		if ((new_config = acl_build_alloc(context, 1,
+				sizeof(*new_config))) == NULL) {
+			RTE_LOG(ERR, ACL,
+				"Failed to get space for new config\n");
+			return (-ENOMEM);
+		}
+
+		memcpy(new_config, config, sizeof(*new_config));
+		config = new_config;
+		rule_sets[num_tries] = NULL;
+
+		for (rule = head; rule != NULL; rule = next) {
+
+			int move = 1;
+
+			next = rule->next;
+			for (m = 0; m < config->num_fields; m++) {
+				int x = config->defs[m].field_index;
+				if (rule->wildness[x] < wild_limit[m]) {
+					move = 0;
+					break;
+				}
+			}
+
+			if (move) {
+				rule->config = new_config;
+				rule->next = rule_sets[num_tries];
+				rule_sets[num_tries] = rule;
+				*prev = next;
+			} else
+				prev = &rule->next;
+		}
+
+		head = rule_sets[num_tries];
+		n = acl_rule_stats(rule_sets[num_tries], config,
+			&wild_limit[0]);
+		num_tries++;
+	}
+
+	if (n > 0)
+		RTE_LOG(DEBUG, ACL,
+			"Number of tries (%d) exceeded.\n", RTE_ACL_MAX_TRIES);
+
+	for (n = 0; n < num_tries; n++) {
+
+		rule_sets[n] = sort_rules(rule_sets[n]);
+		context->tries[n].type = RTE_ACL_FULL_TRIE;
+		context->tries[n].count = 0;
+		context->tries[n].num_data_indexes =
+			acl_build_index(rule_sets[n]->config,
+			context->data_indexes[n]);
+		context->tries[n].data_index = context->data_indexes[n];
+
+		if ((context->bld_tries[n].trie =
+				build_trie(context, rule_sets[n],
+				&last, &context->tries[n].count)) == NULL) {
+			RTE_LOG(ERR, ACL, "Build of %u-th trie failed\n", n);
+			return (-ENOMEM);
+		}
+
+		if (last != NULL) {
+			rule_sets[num_tries++] = last->next;
+			last->next = NULL;
+			acl_free_node(context, context->bld_tries[n].trie);
+			context->tries[n].count = 0;
+
+			if ((context->bld_tries[n].trie =
+					build_trie(context,
+					rule_sets[n], &last,
+					&context->tries[n].count)) == NULL) {
+				RTE_LOG(ERR, ACL,
+					"Build of %u-th trie failed\n", n);
+				return (-ENOMEM);
+			}
+		}
+	}
+
+	context->num_tries = num_tries;
+	return (0);
+}
+
+static void
+acl_build_log(const struct acl_build_context *ctx)
+{
+	uint32_t n;
+
+	RTE_LOG(DEBUG, ACL, "Build phase for ACL \"%s\":\n"
+		"memory consumed: %zu\n",
+		ctx->acx->name,
+		ctx->pool.alloc);
+
+	for (n = 0; n < RTE_DIM(ctx->tries); n++) {
+		if (ctx->tries[n].count != 0)
+			RTE_LOG(DEBUG, ACL,
+				"trie %u: number of rules: %u\n",
+				n, ctx->tries[n].count);
+	}
+}
+
+static int
+acl_build_rules(struct acl_build_context *bcx)
+{
+	struct rte_acl_build_rule *br, *head;
+	const struct rte_acl_rule *rule;
+	uint32_t *wp;
+	uint32_t fn, i, n, num;
+	size_t ofs, sz;
+
+	fn = bcx->cfg.num_fields;
+	n = bcx->acx->num_rules;
+	ofs = n * sizeof(*br);
+	sz = ofs + n * fn * sizeof(*wp);
+
+	if ((br = tb_alloc(&bcx->pool, sz)) == NULL) {
+		RTE_LOG(ERR, ACL, "ACL context %s: failed to create a copy "
+			"of %u build rules (%zu bytes)\n",
+			bcx->acx->name, n, sz);
+		return (-ENOMEM);
+	}
+
+	wp = (uint32_t *)((uintptr_t)br + ofs);
+	num = 0;
+	head = NULL;
+
+	for (i = 0; i != n; i++) {
+		rule = (const struct rte_acl_rule *)
+			((uintptr_t)bcx->acx->rules + bcx->acx->rule_sz * i);
+		if ((rule->data.category_mask & bcx->category_mask) != 0) {
+			br[num].next = head;
+			br[num].config = &bcx->cfg;
+			br[num].f = rule;
+			br[num].wildness = wp;
+			wp += fn;
+			head = br + num;
+			num++;
+		}
+	}
+
+	bcx->num_rules = num;
+	bcx->build_rules = head;
+
+	return (0);
+}
+
+/*
+ * Copy data_indexes for each trie into RT location.
+ */
+static void
+acl_set_data_indexes(struct rte_acl_ctx *ctx)
+{
+	uint32_t i, n, ofs;
+
+	ofs = 0;
+	for (i = 0; i != ctx->num_tries; i++) {
+		n = ctx->trie[i].num_data_indexes;
+		memcpy(ctx->data_indexes + ofs, ctx->trie[i].data_index,
+			n * sizeof(ctx->data_indexes[0]));
+		ctx->trie[i].data_index = ctx->data_indexes + ofs;
+		ofs += n;
+	}
+}
+
+int
+rte_acl_build(struct rte_acl_ctx *ctx, const struct rte_acl_config *cfg)
+{
+	int rc;
+	struct acl_build_context bcx;
+
+	if (ctx == NULL || cfg == NULL || cfg->num_categories == 0 ||
+			cfg->num_categories > RTE_ACL_MAX_CATEGORIES)
+		return -(EINVAL);
+
+	acl_build_reset(ctx);
+
+	memset(&bcx, 0, sizeof(bcx));
+	bcx.acx = ctx;
+	bcx.pool.alignment = ACL_POOL_ALIGN;
+	bcx.pool.min_alloc = ACL_POOL_ALLOC_MIN;
+	bcx.cfg = *cfg;
+	bcx.category_mask = LEN2MASK(bcx.cfg.num_categories);
+
+	/* Create a build rules copy. */
+	if ((rc = acl_build_rules(&bcx)) != 0)
+		return (rc);
+
+	/* No rules to build for that context+config */
+	if (bcx.build_rules == NULL) {
+		rc = -EINVAL;
+
+	/* build internal trie representation. */
+	} else if ((rc = acl_build_tries(&bcx, bcx.build_rules)) == 0) {
+
+		/* allocate and fill run-time structures. */
+		if ((rc = rte_acl_gen(ctx, bcx.tries, bcx.bld_tries,
+				bcx.num_tries, bcx.cfg.num_categories,
+				RTE_ACL_IPV4VLAN_NUM * RTE_DIM(bcx.tries),
+				bcx.num_build_rules)) == 0) {
+
+			/* set data indexes. */
+			acl_set_data_indexes(ctx);
+
+			/* copy in build config. */
+			ctx->config = *cfg;
+		}
+	}
+
+	acl_build_log(&bcx);
+
+	/* cleanup after build. */
+	tb_free_pool(&bcx.pool);
+	return (rc);
+}
diff --git a/lib/librte_acl/acl_gen.c b/lib/librte_acl/acl_gen.c
new file mode 100644
index 0000000..4b4862c
--- /dev/null
+++ b/lib/librte_acl/acl_gen.c
@@ -0,0 +1,473 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "acl_vect.h"
+#include "acl.h"
+
+#define	QRANGE_MIN	((uint8_t)INT8_MIN)
+
+#define	RTE_ACL_VERIFY(exp)	do {                                          \
+	if (!(exp))                                                           \
+		rte_panic("line %d\tassert \"" #exp "\" failed\n", __LINE__); \
+} while (0)
+
+struct acl_node_counters {
+	int                match;
+	int                match_used;
+	int                single;
+	int                quad;
+	int                quad_vectors;
+	int                dfa;
+	int                smallest_match;
+};
+
+struct rte_acl_indices {
+	int                dfa_index;
+	int                quad_index;
+	int                single_index;
+	int                match_index;
+};
+
+static void
+acl_gen_log_stats(const struct rte_acl_ctx *ctx,
+	const struct acl_node_counters *counts)
+{
+	RTE_LOG(DEBUG, ACL, "Gen phase for ACL \"%s\":\n"
+		"runtime memory footprint on socket %d:\n"
+		"single nodes/bytes used: %d/%zu\n"
+		"quad nodes/bytes used: %d/%zu\n"
+		"DFA nodes/bytes used: %d/%zu\n"
+		"match nodes/bytes used: %d/%zu\n"
+		"total: %zu bytes\n",
+		ctx->name, ctx->socket_id,
+		counts->single, counts->single * sizeof(uint64_t),
+		counts->quad, counts->quad_vectors * sizeof(uint64_t),
+		counts->dfa, counts->dfa * RTE_ACL_DFA_SIZE * sizeof(uint64_t),
+		counts->match,
+		counts->match * sizeof(struct rte_acl_match_results),
+		ctx->mem_sz);
+}
+
+/*
+ * Counts the number of groups of sequential bits that are
+ * either 0 or 1, as specified by the zero_one parameter. This is used to
+ * calculate the number of ranges in a node to see if it fits in a quad range
+ * node.
+ */
+static int
+acl_count_sequential_groups(struct rte_acl_bitset *bits, int zero_one)
+{
+	int n, ranges, last_bit;
+
+	ranges = 0;
+	last_bit = zero_one ^ 1;
+
+	for (n = QRANGE_MIN; n < UINT8_MAX + 1; n++) {
+		if (bits->bits[n / (sizeof(bits_t) * 8)] &
+				(1 << (n % (sizeof(bits_t) * 8)))) {
+			if (zero_one == 1 && last_bit != 1)
+				ranges++;
+			last_bit = 1;
+		} else {
+			if (zero_one == 0 && last_bit != 0)
+				ranges++;
+			last_bit = 0;
+		}
+	}
+	for (n = 0; n < QRANGE_MIN; n++) {
+		if (bits->bits[n / (sizeof(bits_t) * 8)] &
+				(1 << (n % (sizeof(bits_t) * 8)))) {
+			if (zero_one == 1 && last_bit != 1)
+				ranges++;
+			last_bit = 1;
+		} else {
+			if (zero_one == 0 && last_bit != 0)
+				ranges++;
+			last_bit = 0;
+		}
+	}
+
+	return (ranges);
+}
+
+/*
+ * Count number of ranges spanned by the node's pointers
+ */
+static int
+acl_count_fanout(struct rte_acl_node *node)
+{
+	uint32_t n;
+	int ranges;
+
+	if (node->fanout != 0)
+		return (node->fanout);
+
+	ranges = acl_count_sequential_groups(&node->values, 0);
+
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL)
+			ranges += acl_count_sequential_groups(
+				&node->ptrs[n].values, 1);
+	}
+
+	node->fanout = ranges;
+	return (node->fanout);
+}
+
+/*
+ * Determine the type of nodes and count each type
+ */
+static int
+acl_count_trie_types(struct acl_node_counters *counts,
+	struct rte_acl_node *node, int match, int force_dfa)
+{
+	uint32_t n;
+	int num_ptrs;
+
+	/* skip if this node has been counted */
+	if (node->node_type != (uint32_t)RTE_ACL_NODE_UNDEFINED)
+		return (match);
+
+	if (node->match_flag != 0 || node->num_ptrs == 0) {
+		counts->match++;
+		if (node->match_flag == -1)
+			node->match_flag = match++;
+		node->node_type = RTE_ACL_NODE_MATCH;
+		if (counts->smallest_match > node->match_flag)
+			counts->smallest_match = node->match_flag;
+		return match;
+	}
+
+	num_ptrs = acl_count_fanout(node);
+
+	/* Force type to dfa */
+	if (force_dfa)
+		num_ptrs = RTE_ACL_DFA_SIZE;
+
+	/* determine node type based on number of ranges */
+	if (num_ptrs == 1) {
+		counts->single++;
+		node->node_type = RTE_ACL_NODE_SINGLE;
+	} else if (num_ptrs <= RTE_ACL_QUAD_MAX) {
+		counts->quad++;
+		counts->quad_vectors += node->fanout;
+		node->node_type = RTE_ACL_NODE_QRANGE;
+	} else {
+		counts->dfa++;
+		node->node_type = RTE_ACL_NODE_DFA;
+	}
+
+	/*
+	 * recursively count the types of all children
+	 */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL)
+			match = acl_count_trie_types(counts, node->ptrs[n].ptr,
+				match, 0);
+	}
+
+	return (match);
+}
+
+static void
+acl_add_ptrs(struct rte_acl_node *node, uint64_t *node_array, uint64_t no_match,
+	int resolved)
+{
+	uint32_t n, x;
+	int m, ranges, last_bit;
+	struct rte_acl_node *child;
+	struct rte_acl_bitset *bits;
+	uint64_t *node_a, index, dfa[RTE_ACL_DFA_SIZE];
+
+	ranges = 0;
+	last_bit = 0;
+
+	for (n = 0; n < RTE_DIM(dfa); n++)
+		dfa[n] = no_match;
+
+	for (x = 0; x < node->num_ptrs; x++) {
+
+		if ((child = node->ptrs[x].ptr) == NULL)
+			continue;
+
+		bits = &node->ptrs[x].values;
+		for (n = 0; n < RTE_DIM(dfa); n++) {
+
+			if (bits->bits[n / (sizeof(bits_t) * CHAR_BIT)] &
+				(1 << (n % (sizeof(bits_t) * CHAR_BIT)))) {
+
+				dfa[n] = resolved ? child->node_index : x;
+				ranges += (last_bit == 0);
+				last_bit = 1;
+			} else {
+				last_bit = 0;
+			}
+		}
+	}
+
+	/*
+	 * Rather than going from 0 to 256, the range count and
+	 * the layout are from 80-ff then 0-7f due to signed compare
+	 * for SSE (cmpgt).
+	 */
+	if (node->node_type == RTE_ACL_NODE_QRANGE) {
+
+		m = 0;
+		node_a = node_array;
+		index = dfa[QRANGE_MIN];
+		*node_a++ = index;
+
+		for (x = QRANGE_MIN + 1; x < UINT8_MAX + 1; x++) {
+			if (dfa[x] != index) {
+				index = dfa[x];
+				*node_a++ = index;
+				node->transitions[m++] = (uint8_t)(x - 1);
+			}
+		}
+
+		for (x = 0; x < INT8_MAX + 1; x++) {
+			if (dfa[x] != index) {
+				index = dfa[x];
+				*node_a++ = index;
+				node->transitions[m++] = (uint8_t)(x - 1);
+			}
+		}
+
+		/* fill unused locations with max value - nothing is greater */
+		for (; m < RTE_ACL_QUAD_SIZE; m++)
+			node->transitions[m] = INT8_MAX;
+
+		RTE_ACL_VERIFY(m <= RTE_ACL_QUAD_SIZE);
+
+	} else if (node->node_type == RTE_ACL_NODE_DFA && resolved) {
+		for (n = 0; n < RTE_DIM(dfa); n++)
+			node_array[n] = dfa[n];
+	}
+}
+
+/*
+ * Routine that allocates space for this node and recursively calls itself
+ * to allocate space for each child. Once all the children are allocated,
+ * it resolves all transitions for this node.
+ */
+static void
+acl_gen_node(struct rte_acl_node *node, uint64_t *node_array,
+	uint64_t no_match, struct rte_acl_indices *index, int num_categories)
+{
+	uint32_t n, *qtrp;
+	uint64_t *array_ptr;
+	struct rte_acl_match_results *match;
+
+	if (node->node_index != RTE_ACL_NODE_UNDEFINED)
+		return;
+
+	array_ptr = NULL;
+
+	switch (node->node_type) {
+	case RTE_ACL_NODE_DFA:
+		node->node_index = index->dfa_index | node->node_type;
+		array_ptr = &node_array[index->dfa_index];
+		index->dfa_index += RTE_ACL_DFA_SIZE;
+		for (n = 0; n < RTE_ACL_DFA_SIZE; n++)
+			array_ptr[n] = no_match;
+		break;
+	case RTE_ACL_NODE_SINGLE:
+		node->node_index = RTE_ACL_QUAD_SINGLE | index->single_index |
+			node->node_type;
+		array_ptr = &node_array[index->single_index];
+		index->single_index += 1;
+		array_ptr[0] = no_match;
+		break;
+	case RTE_ACL_NODE_QRANGE:
+		array_ptr = &node_array[index->quad_index];
+		acl_add_ptrs(node, array_ptr, no_match, 0);
+		qtrp = (uint32_t *)node->transitions;
+		node->node_index = qtrp[0];
+		node->node_index <<= sizeof(index->quad_index) * CHAR_BIT;
+		node->node_index |= index->quad_index | node->node_type;
+		index->quad_index += node->fanout;
+		break;
+	case RTE_ACL_NODE_MATCH:
+		match = ((struct rte_acl_match_results *)
+			(node_array + index->match_index));
+		memcpy(match + node->match_flag, node->mrt, sizeof(*node->mrt));
+		node->node_index = node->match_flag | node->node_type;
+		break;
+	case RTE_ACL_NODE_UNDEFINED:
+		RTE_ACL_VERIFY(node->node_type !=
+			(uint32_t)RTE_ACL_NODE_UNDEFINED);
+		break;
+	}
+
+	/* recursively allocate space for all children */
+	for (n = 0; n < node->num_ptrs; n++) {
+		if (node->ptrs[n].ptr != NULL)
+			acl_gen_node(node->ptrs[n].ptr,
+				node_array,
+				no_match,
+				index,
+				num_categories);
+	}
+
+	/* All children are resolved, resolve this node's pointers */
+	switch (node->node_type) {
+	case RTE_ACL_NODE_DFA:
+		acl_add_ptrs(node, array_ptr, no_match, 1);
+		break;
+	case RTE_ACL_NODE_SINGLE:
+		for (n = 0; n < node->num_ptrs; n++) {
+			if (node->ptrs[n].ptr != NULL)
+				array_ptr[0] = node->ptrs[n].ptr->node_index;
+		}
+		break;
+	case RTE_ACL_NODE_QRANGE:
+		acl_add_ptrs(node, array_ptr, no_match, 1);
+		break;
+	case RTE_ACL_NODE_MATCH:
+		break;
+	case RTE_ACL_NODE_UNDEFINED:
+		RTE_ACL_VERIFY(node->node_type !=
+			(uint32_t)RTE_ACL_NODE_UNDEFINED);
+		break;
+	}
+}
+
+static int
+acl_calc_counts_indicies(struct acl_node_counters *counts,
+	struct rte_acl_indices *indices, struct rte_acl_trie *trie,
+	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
+	int match_num)
+{
+	uint32_t n;
+
+	memset(indices, 0, sizeof(*indices));
+	memset(counts, 0, sizeof(*counts));
+
+	/* Get stats on nodes */
+	for (n = 0; n < num_tries; n++) {
+		counts->smallest_match = INT32_MAX;
+		match_num = acl_count_trie_types(counts, node_bld_trie[n].trie,
+			match_num, 1);
+		trie[n].smallest = counts->smallest_match;
+	}
+
+	indices->dfa_index = RTE_ACL_DFA_SIZE + 1;
+	indices->quad_index = indices->dfa_index +
+		counts->dfa * RTE_ACL_DFA_SIZE;
+	indices->single_index = indices->quad_index + counts->quad_vectors;
+	indices->match_index = indices->single_index + counts->single + 1;
+	indices->match_index = RTE_ALIGN(indices->match_index,
+		(XMM_SIZE / sizeof(uint64_t)));
+
+	return (match_num);
+}
+
+/*
+ * Generate the runtime structure using build structure
+ */
+int
+rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie,
+	struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries,
+	uint32_t num_categories, uint32_t data_index_sz, int match_num)
+{
+	void *mem;
+	size_t total_size;
+	uint64_t *node_array, no_match;
+	uint32_t n, match_index;
+	struct rte_acl_match_results *match;
+	struct acl_node_counters counts;
+	struct rte_acl_indices indices;
+
+	/* Fill counts and indices arrays from the nodes. */
+	match_num = acl_calc_counts_indicies(&counts, &indices, trie,
+		node_bld_trie, num_tries, match_num);
+
+	/* Allocate runtime memory (align to cache boundary) */
+	total_size = RTE_ALIGN(data_index_sz, CACHE_LINE_SIZE) +
+		indices.match_index * sizeof(uint64_t) +
+		(match_num + 2) * sizeof(struct rte_acl_match_results) +
+		XMM_SIZE;
+
+	if ((mem = rte_zmalloc_socket(ctx->name, total_size, CACHE_LINE_SIZE,
+			ctx->socket_id)) == NULL) {
+		RTE_LOG(ERR, ACL,
+			"allocation of %zu bytes on socket %d for %s failed\n",
+			total_size, ctx->socket_id, ctx->name);
+		return (-ENOMEM);
+	}
+
+	/* Fill the runtime structure */
+	match_index = indices.match_index;
+	node_array = (uint64_t *)((uintptr_t)mem +
+		RTE_ALIGN(data_index_sz, CACHE_LINE_SIZE));
+
+	/*
+	 * Set up the NOMATCH node (a SINGLE at the
+	 * highest index, pointing to itself).
+	 */
+
+	node_array[RTE_ACL_DFA_SIZE] = RTE_ACL_DFA_SIZE | RTE_ACL_NODE_SINGLE;
+	no_match = RTE_ACL_NODE_MATCH;
+
+	for (n = 0; n < RTE_ACL_DFA_SIZE; n++)
+		node_array[n] = no_match;
+
+	match = ((struct rte_acl_match_results *)(node_array + match_index));
+	memset(match, 0, sizeof(*match));
+
+	for (n = 0; n < num_tries; n++) {
+
+		acl_gen_node(node_bld_trie[n].trie, node_array, no_match,
+			&indices, num_categories);
+
+		if (node_bld_trie[n].trie->node_index == no_match)
+			trie[n].root_index = 0;
+		else
+			trie[n].root_index = node_bld_trie[n].trie->node_index;
+	}
+
+	ctx->mem = mem;
+	ctx->mem_sz = total_size;
+	ctx->data_indexes = mem;
+	ctx->num_tries = num_tries;
+	ctx->num_categories = num_categories;
+	ctx->match_index = match_index;
+	ctx->no_match = no_match;
+	ctx->idle = node_array[RTE_ACL_DFA_SIZE];
+	ctx->trans_table = node_array;
+	memcpy(ctx->trie, trie, sizeof(ctx->trie));
+
+	acl_gen_log_stats(ctx, &counts);
+	return (0);
+}
diff --git a/lib/librte_acl/acl_run.c b/lib/librte_acl/acl_run.c
new file mode 100644
index 0000000..d08d7ea
--- /dev/null
+++ b/lib/librte_acl/acl_run.c
@@ -0,0 +1,927 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "acl_vect.h"
+#include "acl.h"
+
+#define MAX_SEARCHES_SSE8	8
+#define MAX_SEARCHES_SSE4	4
+#define MAX_SEARCHES_SSE2	2
+#define MAX_SEARCHES_SCALAR	2
+
+#define GET_NEXT_4BYTES(prm, idx)	\
+	(*((const int32_t *)((prm)[(idx)].data + *(prm)[idx].data_index++)))
+
+#define RTE_ACL_NODE_INDEX	((uint32_t)~RTE_ACL_NODE_TYPE)
+
+#define	SCALAR_QRANGE_MULT	0x01010101
+#define	SCALAR_QRANGE_MASK	0x7f7f7f7f
+#define	SCALAR_QRANGE_MIN	0x80808080
+
+enum {
+	SHUFFLE32_SLOT1 = 0xe5,
+	SHUFFLE32_SLOT2 = 0xe6,
+	SHUFFLE32_SLOT3 = 0xe7,
+	SHUFFLE32_SWAP64 = 0x4e,
+};
+
+/*
+ * Structure to manage N parallel trie traversals.
+ * The runtime trie traversal routines can process 8, 4, or 2 tries
+ * in parallel. Each packet may require multiple trie traversals (up to 4).
+ * This structure is used to fill the slots (0 to n-1) for parallel processing
+ * with the trie traversals needed for each packet.
+ */
+struct acl_flow_data {
+	uint32_t            num_packets;
+	/* number of packets processed */
+	uint32_t            started;
+	/* number of trie traversals in progress */
+	uint32_t            trie;
+	/* current trie index (0 to N-1) */
+	uint32_t            cmplt_size;
+	/* number of entries in cmplt_array */
+	uint32_t            total_packets;
+	/* maximum number of packets to process */
+	uint32_t            categories;
+	/* number of result categories per packet */
+	const uint64_t     *trans;
+	const uint8_t     **data;
+	uint32_t           *results;
+	struct completion  *last_cmplt;
+	struct completion  *cmplt_array;
+};
+
+/*
+ * Structure to maintain running results for
+ * a single packet (up to 4 tries).
+ */
+struct completion {
+	uint32_t *results;                          /* running results. */
+	int32_t   priority[RTE_ACL_MAX_CATEGORIES]; /* running priorities. */
+	uint32_t  count;                            /* num of remaining tries */
+	/* nonzero count marks the structure as allocated */
+} __attribute__((aligned(XMM_SIZE)));
+
+/*
+ * One parms structure for each slot in the search engine.
+ */
+struct parms {
+	const uint8_t              *data;
+	/* input data for this packet */
+	const uint32_t             *data_index;
+	/* data indirection for this trie */
+	struct completion          *cmplt;
+	/* completion data for this packet */
+};
+
+/*
+ * Define a global idle node for unused engine slots
+ */
+static const uint32_t idle[UINT8_MAX + 1];
+
+static const rte_xmm_t mm_type_quad_range = {
+	.u32 = {
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+	},
+};
+
+static const rte_xmm_t mm_type_quad_range64 = {
+	.u32 = {
+		RTE_ACL_NODE_QRANGE,
+		RTE_ACL_NODE_QRANGE,
+		0,
+		0,
+	},
+};
+
+static const rte_xmm_t mm_shuffle_input = {
+	.u32 = {0x00000000, 0x04040404, 0x08080808, 0x0c0c0c0c},
+};
+
+static const rte_xmm_t mm_shuffle_input64 = {
+	.u32 = {0x00000000, 0x04040404, 0x80808080, 0x80808080},
+};
+
+static const rte_xmm_t mm_ones_16 = {
+	.u16 = {1, 1, 1, 1, 1, 1, 1, 1},
+};
+
+static const rte_xmm_t mm_bytes = {
+	.u32 = {UINT8_MAX, UINT8_MAX, UINT8_MAX, UINT8_MAX},
+};
+
+static const rte_xmm_t mm_bytes64 = {
+	.u32 = {UINT8_MAX, UINT8_MAX, 0, 0},
+};
+
+static const rte_xmm_t mm_match_mask = {
+	.u32 = {
+		RTE_ACL_NODE_MATCH,
+		RTE_ACL_NODE_MATCH,
+		RTE_ACL_NODE_MATCH,
+		RTE_ACL_NODE_MATCH,
+	},
+};
+
+static const rte_xmm_t mm_match_mask64 = {
+	.u32 = {
+		RTE_ACL_NODE_MATCH,
+		0,
+		RTE_ACL_NODE_MATCH,
+		0,
+	},
+};
+
+static const rte_xmm_t mm_index_mask = {
+	.u32 = {
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+	},
+};
+
+static const rte_xmm_t mm_index_mask64 = {
+	.u32 = {
+		RTE_ACL_NODE_INDEX,
+		RTE_ACL_NODE_INDEX,
+		0,
+		0,
+	},
+};
+
+/*
+ * Allocate a completion structure to manage the tries for a packet.
+ */
+static inline struct completion *
+alloc_completion(struct completion *p, uint32_t size, uint32_t tries,
+	uint32_t *results)
+{
+	uint32_t n;
+
+	for (n = 0; n < size; n++) {
+
+		if (p[n].count == 0) {
+
+			/* mark as allocated and set number of tries. */
+			p[n].count = tries;
+			p[n].results = results;
+			return &(p[n]);
+		}
+	}
+
+	/* should never get here */
+	return (NULL);
+}
+
+/*
+ * Resolve priority for a single result trie.
+ */
+static inline void
+resolve_single_priority(uint64_t transition, int n,
+	const struct rte_acl_ctx *ctx, struct parms *parms,
+	const struct rte_acl_match_results *p)
+{
+	if (parms[n].cmplt->count == ctx->num_tries ||
+			parms[n].cmplt->priority[0] <=
+			p[transition].priority[0]) {
+
+		parms[n].cmplt->priority[0] = p[transition].priority[0];
+		parms[n].cmplt->results[0] = p[transition].results[0];
+	}
+
+	parms[n].cmplt->count--;
+}
+
+/*
+ * Resolve priority for multiple results. This consists of comparing
+ * the priority of the current traversal with the running set of
+ * results for the packet. For each result, keep a running array of
+ * the result (rule number) and its priority for each category.
+ */
+static inline void
+resolve_priority(uint64_t transition, int n, const struct rte_acl_ctx *ctx,
+	struct parms *parms, const struct rte_acl_match_results *p,
+	uint32_t categories)
+{
+	uint32_t x;
+	xmm_t results, priority, results1, priority1, selector;
+	xmm_t *saved_results, *saved_priority;
+
+	for (x = 0; x < categories; x += RTE_ACL_RESULTS_MULTIPLIER) {
+
+		saved_results = (xmm_t *)(&parms[n].cmplt->results[x]);
+		saved_priority =
+			(xmm_t *)(&parms[n].cmplt->priority[x]);
+
+		/* get results and priorities for completed trie */
+		results = MM_LOADU((const xmm_t *)&p[transition].results[x]);
+		priority = MM_LOADU((const xmm_t *)&p[transition].priority[x]);
+
+		/* if this is not the first completed trie */
+		if (parms[n].cmplt->count != ctx->num_tries) {
+
+			/* get running best results and their priorities */
+			results1 = MM_LOADU(saved_results);
+			priority1 = MM_LOADU(saved_priority);
+
+			/* select results that are highest priority */
+			selector = MM_CMPGT32(priority1, priority);
+			results = MM_BLENDV8(results, results1, selector);
+			priority = MM_BLENDV8(priority, priority1, selector);
+		}
+
+		/* save running best results and their priorities */
+		MM_STOREU(saved_results, results);
+		MM_STOREU(saved_priority, priority);
+	}
+
+	/* Count down completed tries for this search request */
+	parms[n].cmplt->count--;
+}
+
+/*
+ * Routine to fill a slot in the parallel trie traversal array (parms) from
+ * the list of packets (flows).
+ */
+static inline uint64_t
+acl_start_next_trie(struct acl_flow_data *flows, struct parms *parms, int n,
+	const struct rte_acl_ctx *ctx)
+{
+	uint64_t transition;
+
+	/* if there are any more packets to process */
+	if (flows->num_packets < flows->total_packets) {
+		parms[n].data = flows->data[flows->num_packets];
+		parms[n].data_index = ctx->trie[flows->trie].data_index;
+
+		/* if this is the first trie for this packet */
+		if (flows->trie == 0) {
+			flows->last_cmplt = alloc_completion(flows->cmplt_array,
+				flows->cmplt_size, ctx->num_tries,
+				flows->results +
+				flows->num_packets * flows->categories);
+		}
+
+		/* set completion parameters and starting index for this slot */
+		parms[n].cmplt = flows->last_cmplt;
+		transition =
+			flows->trans[parms[n].data[*parms[n].data_index++] +
+			ctx->trie[flows->trie].root_index];
+
+		/*
+		 * if this is the last trie for this packet,
+		 * then set up the next packet.
+		 */
+		flows->trie++;
+		if (flows->trie >= ctx->num_tries) {
+			flows->trie = 0;
+			flows->num_packets++;
+		}
+
+		/* keep track of number of active trie traversals */
+		flows->started++;
+
+	/* no more tries to process, set slot to an idle position */
+	} else {
+		transition = ctx->idle;
+		parms[n].data = (const uint8_t *)idle;
+		parms[n].data_index = idle;
+	}
+	return transition;
+}
+
+/*
+ * Detect matches. If a match node transition is found, then this trie
+ * traversal is complete, and the slot is refilled with the next trie
+ * to be processed.
+ */
+static inline uint64_t
+acl_match_check_transition(uint64_t transition, int slot,
+	const struct rte_acl_ctx *ctx, struct parms *parms,
+	struct acl_flow_data *flows)
+{
+	const struct rte_acl_match_results *p;
+
+	p = (const struct rte_acl_match_results *)
+		(flows->trans + ctx->match_index);
+
+	if (transition & RTE_ACL_NODE_MATCH) {
+
+		/* Remove flags from index and decrement active traversals */
+		transition &= RTE_ACL_NODE_INDEX;
+		flows->started--;
+
+		/* Resolve priorities for this trie and running results */
+		if (flows->categories == 1)
+			resolve_single_priority(transition, slot, ctx,
+				parms, p);
+		else
+			resolve_priority(transition, slot, ctx, parms, p,
+				flows->categories);
+
+		/* Fill the slot with the next trie or idle trie */
+		transition = acl_start_next_trie(flows, parms, slot, ctx);
+
+	} else if (transition == ctx->idle) {
+		/* reset indirection table for idle slots */
+		parms[slot].data_index = idle;
+	}
+
+	return transition;
+}
+
+/*
+ * Extract transitions from an XMM register and check for any matches
+ */
+static void
+acl_process_matches(xmm_t *indicies, int slot, const struct rte_acl_ctx *ctx,
+	struct parms *parms, struct acl_flow_data *flows)
+{
+	uint64_t transition1, transition2;
+
+	/* extract transition from low 64 bits. */
+	transition1 = MM_CVT64(*indicies);
+
+	/* extract transition from high 64 bits. */
+	*indicies = MM_SHUFFLE32(*indicies, SHUFFLE32_SWAP64);
+	transition2 = MM_CVT64(*indicies);
+
+	transition1 = acl_match_check_transition(transition1, slot, ctx,
+		parms, flows);
+	transition2 = acl_match_check_transition(transition2, slot + 1, ctx,
+		parms, flows);
+
+	/* update indicies with new transitions. */
+	*indicies = MM_SET64(transition2, transition1);
+}
+
+/*
+ * Check for a match in 2 transitions (contained in SSE register)
+ */
+static inline void
+acl_match_check_x2(int slot, const struct rte_acl_ctx *ctx, struct parms *parms,
+	struct acl_flow_data *flows, xmm_t *indicies, xmm_t match_mask)
+{
+	xmm_t temp;
+
+	temp = MM_AND(match_mask, *indicies);
+	if (!MM_TESTZ(temp, temp)) {
+		acl_process_matches(indicies, slot, ctx, parms, flows);
+	}
+}
+
+/*
+ * Check for any match in 4 transitions (contained in 2 SSE registers)
+ */
+static inline void
+acl_match_check_x4(int slot, const struct rte_acl_ctx *ctx, struct parms *parms,
+	struct acl_flow_data *flows, xmm_t *indicies1, xmm_t *indicies2,
+	xmm_t match_mask)
+{
+	xmm_t temp;
+
+	/* put low 32 bits of each transition into one register */
+	temp = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1, (__m128)*indicies2,
+		0x88);
+
+	/* test for match node */
+	temp = MM_AND(match_mask, temp);
+	if (!MM_TESTZ(temp, temp)) {
+		acl_process_matches(indicies1, slot, ctx, parms, flows);
+		acl_process_matches(indicies2, slot + 2, ctx, parms, flows);
+	}
+}
+
+/*
+ * Calculate the address of the next transition for
+ * all types of nodes. Note that only DFA nodes and range
+ * nodes actually transition to another node. Match
+ * nodes don't move.
+ */
+static inline xmm_t
+acl_calc_addr(xmm_t index_mask, xmm_t next_input, xmm_t shuffle_input,
+	xmm_t ones_16, xmm_t bytes, xmm_t type_quad_range,
+	xmm_t *indicies1, xmm_t *indicies2)
+{
+	xmm_t addr, node_types, temp;
+
+	/*
+	 * Note that no transition is done for a match
+	 * node and therefore a stream freezes when
+	 * it reaches a match.
+	 */
+
+	/* Shuffle low 32 into temp and high 32 into indicies2 */
+	temp = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1, (__m128)*indicies2,
+		0x88);
+	*indicies2 = (xmm_t)MM_SHUFFLEPS((__m128)*indicies1,
+		(__m128)*indicies2, 0xdd);
+
+	/* Calc node type and node addr */
+	node_types = MM_ANDNOT(index_mask, temp);
+	addr = MM_AND(index_mask, temp);
+
+	/*
+	 * Calc addr for DFAs - addr = dfa_index + input_byte
+	 */
+
+	/* mask for DFA type (0) nodes */
+	temp = MM_CMPEQ32(node_types, MM_XOR(node_types, node_types));
+
+	/* add input byte to DFA position */
+	temp = MM_AND(temp, bytes);
+	temp = MM_AND(temp, next_input);
+	addr = MM_ADD32(addr, temp);
+
+	/*
+	 * Calc addr for Range nodes -> range_index + range(input)
+	 */
+	node_types = MM_CMPEQ32(node_types, type_quad_range);
+
+	/*
+	 * Calculate number of range boundaries that are less than the
+	 * input value. Range boundaries for each node are in signed 8 bit,
+	 * ordered from -128 to 127 in the indicies2 register.
+	 * This is effectively a popcnt of bytes that are greater than the
+	 * input byte.
+	 */
+
+	/* shuffle input byte to all 4 positions of 32 bit value */
+	temp = MM_SHUFFLE8(next_input, shuffle_input);
+
+	/* check ranges */
+	temp = MM_CMPGT8(temp, *indicies2);
+
+	/* convert -1 to 1 (bytes greater than input byte) */
+	temp = MM_SIGN8(temp, temp);
+
+	/* horizontal add pairs of bytes into words */
+	temp = MM_MADD8(temp, temp);
+
+	/* horizontal add pairs of words into dwords */
+	temp = MM_MADD16(temp, ones_16);
+
+	/* mask to range type nodes */
+	temp = MM_AND(temp, node_types);
+
+	/* add index into node position */
+	return (MM_ADD32(addr, temp));
+}
+
+/*
+ * Process 4 transitions (in 2 SIMD registers) in parallel
+ */
+static inline xmm_t
+transition4(xmm_t index_mask, xmm_t next_input, xmm_t shuffle_input,
+	xmm_t ones_16, xmm_t bytes, xmm_t type_quad_range,
+	const uint64_t *trans, xmm_t *indicies1, xmm_t *indicies2)
+{
+	xmm_t addr;
+	uint64_t trans0, trans2;
+
+	 /* Calculate the address (array index) for all 4 transitions. */
+
+	addr = acl_calc_addr(index_mask, next_input, shuffle_input, ones_16,
+		bytes, type_quad_range, indicies1, indicies2);
+
+	 /* Gather 64 bit transitions and pack back into 2 registers. */
+
+	trans0 = trans[MM_CVT32(addr)];
+
+	/* get slot 2 */
+
+	/* {x0, x1, x2, x3} -> {x2, x1, x2, x3} */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT2);
+	trans2 = trans[MM_CVT32(addr)];
+
+	/* get slot 1 */
+
+	/* {x2, x1, x2, x3} -> {x1, x1, x2, x3} */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT1);
+	*indicies1 = MM_SET64(trans[MM_CVT32(addr)], trans0);
+
+	/* get slot 3 */
+
+	/* {x1, x1, x2, x3} -> {x3, x1, x2, x3} */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT3);
+	*indicies2 = MM_SET64(trans[MM_CVT32(addr)], trans2);
+
+	return (MM_SRL32(next_input, 8));
+}
+
+static inline void
+acl_set_flow(struct acl_flow_data *flows, struct completion *cmplt,
+	uint32_t cmplt_size, const uint8_t **data, uint32_t *results,
+	uint32_t data_num, uint32_t categories, const uint64_t *trans)
+{
+	flows->num_packets = 0;
+	flows->started = 0;
+	flows->trie = 0;
+	flows->last_cmplt = NULL;
+	flows->cmplt_array = cmplt;
+	flows->total_packets = data_num;
+	flows->categories = categories;
+	flows->cmplt_size = cmplt_size;
+	flows->data = data;
+	flows->results = results;
+	flows->trans = trans;
+}
+
+/*
+ * Execute trie traversal with 8 traversals in parallel
+ */
+static inline void
+search_sse_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SSE8];
+	struct completion cmplt[MAX_SEARCHES_SSE8];
+	struct parms parms[MAX_SEARCHES_SSE8];
+	xmm_t input0, input1;
+	xmm_t indicies1, indicies2, indicies3, indicies4;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SSE8; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	/*
+	 * indicies1 contains index_array[0,1]
+	 * indicies2 contains index_array[2,3]
+	 * indicies3 contains index_array[4,5]
+	 * indicies4 contains index_array[6,7]
+	 */
+
+	indicies1 = MM_LOADU((xmm_t *) &index_array[0]);
+	indicies2 = MM_LOADU((xmm_t *) &index_array[2]);
+
+	indicies3 = MM_LOADU((xmm_t *) &index_array[4]);
+	indicies4 = MM_LOADU((xmm_t *) &index_array[6]);
+
+	while (flows.started > 0) {
+
+		/* Gather 4 bytes of input data for each stream. */
+		input0 = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0),
+			0);
+		input1 = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 4),
+			0);
+
+		input0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 1), 1);
+		input1 = MM_INSERT32(input1, GET_NEXT_4BYTES(parms, 5), 1);
+
+		input0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 2), 2);
+		input1 = MM_INSERT32(input1, GET_NEXT_4BYTES(parms, 6), 2);
+
+		input0 = MM_INSERT32(input0, GET_NEXT_4BYTES(parms, 3), 3);
+		input1 = MM_INSERT32(input1, GET_NEXT_4BYTES(parms, 7), 3);
+
+		/* Process the 4 bytes of input on each stream. */
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		input0 = transition4(mm_index_mask.m, input0,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input1 = transition4(mm_index_mask.m, input1,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies3, &indicies4);
+
+		/* Check for any matches. */
+
+		acl_match_check_x4(0, ctx, parms, &flows,
+			&indicies1, &indicies2, mm_match_mask.m);
+		acl_match_check_x4(4, ctx, parms, &flows,
+			&indicies3, &indicies4, mm_match_mask.m);
+	}
+}
+
+/*
+ * Execute trie traversal, processing 4 flows in parallel.
+ */
+static inline void
+search_sse_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	 uint32_t *results, int total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SSE4];
+	struct completion cmplt[MAX_SEARCHES_SSE4];
+	struct parms parms[MAX_SEARCHES_SSE4];
+	xmm_t input, indicies1, indicies2;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SSE4; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	indicies1 = MM_LOADU((xmm_t *) &index_array[0]);
+	indicies2 = MM_LOADU((xmm_t *) &index_array[2]);
+
+	while (flows.started > 0) {
+
+		/* Gather 4 bytes of input data for each stream. */
+		input = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0), 0);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 1), 1);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 2), 2);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 3), 3);
+
+		/* Process the 4 bytes of input on each stream. */
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		input = transition4(mm_index_mask.m, input,
+			mm_shuffle_input.m, mm_ones_16.m,
+			mm_bytes.m, mm_type_quad_range.m,
+			flows.trans, &indicies1, &indicies2);
+
+		/* Check for any matches. */
+
+		acl_match_check_x4(0, ctx, parms, &flows,
+			&indicies1, &indicies2, mm_match_mask.m);
+	}
+}
+
+static inline xmm_t
+transition2(xmm_t index_mask, xmm_t next_input, xmm_t shuffle_input,
+	xmm_t ones_16, xmm_t bytes, xmm_t type_quad_range,
+	const uint64_t *trans, xmm_t *indicies1)
+{
+	uint64_t t;
+	xmm_t addr, indicies2;
+
+	indicies2 = MM_XOR(ones_16, ones_16);
+
+	addr = acl_calc_addr(index_mask, next_input, shuffle_input, ones_16,
+		bytes, type_quad_range, indicies1, &indicies2);
+
+	/* Gather 64 bit transitions and pack 2 per register. */
+
+	t = trans[MM_CVT32(addr)];
+
+	/* get slot 1 */
+	addr = MM_SHUFFLE32(addr, SHUFFLE32_SLOT1);
+	*indicies1 = MM_SET64(trans[MM_CVT32(addr)], t);
+
+	return (MM_SRL32(next_input, 8));
+}
+
+/*
+ * Execute trie traversal, processing 2 flows in parallel.
+ */
+static inline void
+search_sse_2(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t total_packets, uint32_t categories)
+{
+	int n;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SSE2];
+	struct completion cmplt[MAX_SEARCHES_SSE2];
+	struct parms parms[MAX_SEARCHES_SSE2];
+	xmm_t input, indicies;
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
+		total_packets, categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SSE2; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	indicies = MM_LOADU((xmm_t *) &index_array[0]);
+
+	while (flows.started > 0) {
+
+		/* Gather 4 bytes of input data for each stream. */
+		input = MM_INSERT32(mm_ones_16.m, GET_NEXT_4BYTES(parms, 0), 0);
+		input = MM_INSERT32(input, GET_NEXT_4BYTES(parms, 1), 1);
+
+		/* Process the 4 bytes of input on each stream. */
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		input = transition2(mm_index_mask64.m, input,
+			mm_shuffle_input64.m, mm_ones_16.m,
+			mm_bytes64.m, mm_type_quad_range64.m,
+			flows.trans, &indicies);
+
+		/* Check for any matches. */
+		acl_match_check_x2(0, ctx, parms, &flows, &indicies,
+			mm_match_mask64.m);
+	}
+}
+
+/*
+ * When processing a transition, rather than using an if/else
+ * construct, the offset is calculated for both the DFA and QRANGE
+ * node types and then conditionally added to the address based on
+ * the node type. This is done to avoid branch mis-predictions. Since
+ * the offset is a rather simple calculation, it is more efficient to
+ * compute it unconditionally and use a conditional move than to take
+ * a conditional branch to select which calculation to perform.
+ */
+static inline uint32_t
+scan_forward(uint32_t input, uint32_t max)
+{
+	return ((input == 0) ? max : rte_bsf32(input));
+}
+
+static inline uint64_t
+scalar_transition(const uint64_t *trans_table, uint64_t transition,
+	uint8_t input)
+{
+	uint32_t addr, index, ranges, x, a, b, c;
+
+	/* break transition into component parts */
+	ranges = transition >> (sizeof(index) * CHAR_BIT);
+
+	/* calc address for a QRANGE node */
+	c = input * SCALAR_QRANGE_MULT;
+	a = ranges | SCALAR_QRANGE_MIN;
+	index = transition & ~RTE_ACL_NODE_INDEX;
+	a -= (c & SCALAR_QRANGE_MASK);
+	b = c & SCALAR_QRANGE_MIN;
+	addr = transition ^ index;
+	a &= SCALAR_QRANGE_MIN;
+	a ^= (ranges ^ b) & (a ^ b);
+	x = scan_forward(a, 32) >> 3;
+	addr += (index == RTE_ACL_NODE_DFA) ? input : x;
+
+	/* pick up the next transition */
+	transition = *(trans_table + addr);
+	return (transition);
+}
+
+int
+rte_acl_classify_scalar(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories)
+{
+	int n;
+	uint64_t transition0, transition1;
+	uint32_t input0, input1;
+	struct acl_flow_data flows;
+	uint64_t index_array[MAX_SEARCHES_SCALAR];
+	struct completion cmplt[MAX_SEARCHES_SCALAR];
+	struct parms parms[MAX_SEARCHES_SCALAR];
+
+	if (categories != 1 &&
+		((RTE_ACL_RESULTS_MULTIPLIER - 1) & categories) != 0)
+		return (-EINVAL);
+
+	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results, num,
+		categories, ctx->trans_table);
+
+	for (n = 0; n < MAX_SEARCHES_SCALAR; n++) {
+		cmplt[n].count = 0;
+		index_array[n] = acl_start_next_trie(&flows, parms, n, ctx);
+	}
+
+	transition0 = index_array[0];
+	transition1 = index_array[1];
+
+	while (flows.started > 0) {
+
+		input0 = GET_NEXT_4BYTES(parms, 0);
+		input1 = GET_NEXT_4BYTES(parms, 1);
+
+		for (n = 0; n < 4; n++) {
+			if (likely((transition0 & RTE_ACL_NODE_MATCH) == 0))
+				transition0 = scalar_transition(flows.trans,
+					transition0, (uint8_t)input0);
+
+			input0 >>= CHAR_BIT;
+
+			if (likely((transition1 & RTE_ACL_NODE_MATCH) == 0))
+				transition1 = scalar_transition(flows.trans,
+					transition1, (uint8_t)input1);
+
+			input1 >>= CHAR_BIT;
+
+		}
+		if ((transition0 | transition1) & RTE_ACL_NODE_MATCH) {
+			transition0 = acl_match_check_transition(transition0,
+				0, ctx, parms, &flows);
+			transition1 = acl_match_check_transition(transition1,
+				1, ctx, parms, &flows);
+
+		}
+	}
+	return (0);
+}
+
+int
+rte_acl_classify(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories)
+{
+	if (categories != 1 &&
+		((RTE_ACL_RESULTS_MULTIPLIER - 1) & categories) != 0)
+		return (-EINVAL);
+
+	if (likely(num >= MAX_SEARCHES_SSE8))
+		search_sse_8(ctx, data, results, num, categories);
+	else if (num >= MAX_SEARCHES_SSE4)
+		search_sse_4(ctx, data, results, num, categories);
+	else
+		search_sse_2(ctx, data, results, num, categories);
+
+	return (0);
+}
diff --git a/lib/librte_acl/acl_vect.h b/lib/librte_acl/acl_vect.h
new file mode 100644
index 0000000..d08c45d
--- /dev/null
+++ b/lib/librte_acl/acl_vect.h
@@ -0,0 +1,129 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_VECT_H_
+#define _RTE_ACL_VECT_H_
+
+/**
+ * @file
+ *
+ * RTE ACL SSE/AVX related header.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define	MM_ADD16(a, b)		_mm_add_epi16(a,b)
+#define	MM_ADD32(a, b)		_mm_add_epi32(a, b)
+#define	MM_ALIGNR8(a, b, c)	_mm_alignr_epi8(a, b, c)
+#define	MM_AND(a, b)		_mm_and_si128(a, b)
+#define MM_ANDNOT(a, b)		_mm_andnot_si128(a, b)
+#define MM_BLENDV8(a, b, c)	_mm_blendv_epi8(a, b, c)
+#define MM_CMPEQ16(a, b)	_mm_cmpeq_epi16(a, b)
+#define MM_CMPEQ32(a, b)	_mm_cmpeq_epi32(a, b)
+#define	MM_CMPEQ8(a, b)		_mm_cmpeq_epi8(a, b)
+#define MM_CMPGT32(a, b) 	_mm_cmpgt_epi32(a, b)
+#define MM_CMPGT8(a, b) 	_mm_cmpgt_epi8(a, b)
+#define MM_CVT(a)		_mm_cvtsi32_si128(a)
+#define	MM_CVT32(a)		_mm_cvtsi128_si32(a)
+#define MM_CVTU32(a)		_mm_cvtsi32_si128(a)
+#define	MM_INSERT16(a, c, b)	_mm_insert_epi16(a, c, b)
+#define	MM_INSERT32(a, c, b)	_mm_insert_epi32(a, c, b)
+#define	MM_LOAD(a)		_mm_load_si128(a)
+#define	MM_LOADH_PI(a, b)	_mm_loadh_pi(a, b)
+#define	MM_LOADU(a)		_mm_loadu_si128(a)
+#define	MM_MADD16(a, b)		_mm_madd_epi16(a, b)
+#define	MM_MADD8(a, b)		_mm_maddubs_epi16(a, b)
+#define	MM_MOVEMASK8(a)		_mm_movemask_epi8(a)
+#define MM_OR(a, b)		_mm_or_si128(a, b)
+#define	MM_SET1_16(a)		_mm_set1_epi16(a)
+#define	MM_SET1_32(a)		_mm_set1_epi32(a)
+#define	MM_SET1_64(a)		_mm_set1_epi64(a)
+#define	MM_SET1_8(a)		_mm_set1_epi8(a)
+#define	MM_SET32(a,b,c,d)	_mm_set_epi32(a,b,c,d)
+#define	MM_SHUFFLE32(a, b)	_mm_shuffle_epi32(a, b)
+#define	MM_SHUFFLE8(a, b)	_mm_shuffle_epi8(a, b)
+#define	MM_SHUFFLEPS(a, b, c)	_mm_shuffle_ps(a, b, c)
+#define	MM_SIGN8(a, b)		_mm_sign_epi8(a, b)
+#define	MM_SLL64(a, b)		_mm_sll_epi64(a, b)
+#define	MM_SRL128(a, b)		_mm_srli_si128(a, b)
+#define MM_SRL16(a, b)		_mm_srli_epi16(a, b)
+#define	MM_SRL32(a, b)		_mm_srli_epi32(a, b)
+#define	MM_STORE(a, b)		_mm_store_si128(a, b)
+#define	MM_STOREU(a, b)		_mm_storeu_si128(a, b)
+#define	MM_TESTZ(a, b)		_mm_testz_si128(a, b)
+#define	MM_XOR(a, b)		_mm_xor_si128(a, b)
+
+#define	MM_SET16(a,b,c,d, e, f, g, h)	_mm_set_epi16(a,b,c,d, e, f, g, h)
+
+#define	MM_SET8(c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,cA,cB,cC,cD,cE,cF)	\
+	_mm_set_epi8(c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,cA,cB,cC,cD,cE,cF)
+
+#ifdef RTE_ARCH_X86_64
+
+#define	MM_CVT64(a)		_mm_cvtsi128_si64(a)
+
+#else
+
+#define	MM_CVT64(a)	({ \
+	rte_xmm_t m;       \
+	m.m = (a);         \
+	(m.u64[0]);        \
+})
+
+#endif /*RTE_ARCH_X86_64 */
+
+/*
+ * Prior to version 12.1, icc doesn't support _mm_set_epi64x.
+ */
+#if (defined(__ICC) && __ICC < 1210)
+
+#define	MM_SET64(a,b)	({ \
+	rte_xmm_t m;       \
+	m.u64[0] = b;      \
+	m.u64[1] = a;      \
+	(m.m);             \
+})
+
+#else
+
+#define	MM_SET64(a,b)		_mm_set_epi64x(a,b)
+
+#endif /* (defined(__ICC) && __ICC < 1210) */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACL_VECT_H_ */
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
new file mode 100644
index 0000000..0efe117
--- /dev/null
+++ b/lib/librte_acl/rte_acl.c
@@ -0,0 +1,413 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include "acl.h"
+
+#define	BIT_SIZEOF(x)	(sizeof(x) * CHAR_BIT)
+
+TAILQ_HEAD(rte_acl_list, rte_acl_ctx);
+
+struct rte_acl_ctx *
+rte_acl_find_existing(const char *name)
+{
+	struct rte_acl_ctx *ctx;
+	struct rte_acl_list *acl_list;
+
+	/* check that we have an initialised tail queue */
+	if ((acl_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_ACL,
+			rte_acl_list)) == NULL) {
+		rte_errno = E_RTE_NO_TAILQ;
+		return NULL;
+	}
+
+	rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_FOREACH(ctx, acl_list, next) {
+		if (strncmp(name, ctx->name, sizeof(ctx->name)) == 0)
+			break;
+	}
+	rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	if (ctx == NULL)
+		rte_errno = ENOENT;
+	return (ctx);
+}
+
+void
+rte_acl_free(struct rte_acl_ctx *ctx)
+{
+	if (ctx == NULL)
+		return;
+
+	RTE_EAL_TAILQ_REMOVE(RTE_TAILQ_ACL, rte_acl_list, ctx);
+
+	rte_free(ctx->mem);
+	rte_free(ctx);
+}
+
+struct rte_acl_ctx *
+rte_acl_create(const struct rte_acl_param *param)
+{
+	size_t sz;
+	struct rte_acl_ctx *ctx;
+	struct rte_acl_list *acl_list;
+	char name[sizeof (ctx->name)];
+
+	/* check that we have an initialised tail queue */
+	if ((acl_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_ACL,
+			rte_acl_list)) == NULL) {
+		rte_errno = E_RTE_NO_TAILQ;
+		return NULL;
+	}
+
+	/* check that input parameters are valid. */
+	if (param == NULL || param->name == NULL) {
+		rte_errno = EINVAL;
+		return (NULL);
+	}
+
+	rte_snprintf(name, sizeof(name), "ACL_%s", param->name);
+
+	/* calculate amount of memory required for pattern set. */
+	sz = sizeof (*ctx) + param->max_rule_num * param->rule_size;
+
+	/* get EAL TAILQ lock. */
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+
+	/* if we already have one with that name */
+	TAILQ_FOREACH(ctx, acl_list, next) {
+		if (strncmp(param->name, ctx->name, sizeof (ctx->name)) == 0)
+			break;
+	}
+
+	/* if ACL with such name doesn't exist, then create a new one. */
+	if (ctx == NULL && (ctx = rte_zmalloc_socket(name, sz, CACHE_LINE_SIZE,
+			param->socket_id)) != NULL) {
+
+		/* init new allocated context. */
+		ctx->rules = ctx + 1;
+		ctx->max_rules = param->max_rule_num;
+		ctx->rule_sz = param->rule_size;
+		ctx->socket_id = param->socket_id;
+		rte_snprintf(ctx->name, sizeof(ctx->name), "%s", param->name);
+
+		TAILQ_INSERT_TAIL(acl_list, ctx, next);
+
+	} else if (ctx == NULL) {
+		RTE_LOG(ERR, ACL,
+			"allocation of %zu bytes on socket %d for %s failed\n",
+			sz, param->socket_id, name);
+	}
+
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+	return (ctx);
+}
+
+static int
+acl_add_rules(struct rte_acl_ctx *ctx, const void *rules, uint32_t num)
+{
+	uint8_t *pos;
+
+	if (num + ctx->num_rules > ctx->max_rules)
+		return (-ENOMEM);
+
+	pos = ctx->rules;
+	pos += ctx->rule_sz * ctx->num_rules;
+	memcpy(pos, rules, num * ctx->rule_sz);
+	ctx->num_rules += num;
+
+	return (0);
+}
+
+static int
+acl_check_rule(const struct rte_acl_rule_data *rd)
+{
+	if ((rd->category_mask & LEN2MASK(RTE_ACL_MAX_CATEGORIES)) == 0 ||
+			rd->priority > RTE_ACL_MAX_PRIORITY ||
+			rd->priority < RTE_ACL_MIN_PRIORITY ||
+			rd->userdata == RTE_ACL_INVALID_USERDATA)
+		return (-EINVAL);
+	return (0);
+}
+
+int
+rte_acl_add_rules(struct rte_acl_ctx *ctx, const struct rte_acl_rule *rules,
+	uint32_t num)
+{
+	const struct rte_acl_rule *rv;
+	uint32_t i;
+	int32_t rc;
+
+	if (ctx == NULL || rules == NULL || 0 == ctx->rule_sz)
+		return (-EINVAL);
+
+	for (i = 0; i != num; i++) {
+		rv = (const struct rte_acl_rule *)
+			((uintptr_t)rules + i * ctx->rule_sz);
+		if ((rc = acl_check_rule(&rv->data)) != 0) {
+			RTE_LOG(ERR, ACL, "%s(%s): rule #%u is invalid\n",
+				__func__, ctx->name, i + 1);
+			return (rc);
+		}
+	}
+
+	return (acl_add_rules(ctx, rules, num));
+}
+
+/*
+ * Reset all rules.
+ * Note that RT structures are not affected.
+ */
+void
+rte_acl_reset_rules(struct rte_acl_ctx *ctx)
+{
+	if (ctx != NULL)
+		ctx->num_rules = 0;
+}
+
+/*
+ * Reset all rules and destroy RT structures.
+ */
+void
+rte_acl_reset(struct rte_acl_ctx *ctx)
+{
+	if (ctx != NULL) {
+		rte_acl_reset_rules(ctx);
+		rte_acl_build(ctx, &ctx->config);
+	}
+}
+
+/*
+ * Dump the ACL context to stdout.
+ */
+void
+rte_acl_dump(const struct rte_acl_ctx *ctx)
+{
+	if (!ctx)
+		return;
+	printf("acl context <%s>@%p\n", ctx->name, ctx);
+	printf("  max_rules=%"PRIu32"\n", ctx->max_rules);
+	printf("  rule_size=%"PRIu32"\n", ctx->rule_sz);
+	printf("  num_rules=%"PRIu32"\n", ctx->num_rules);
+	printf("  num_categories=%"PRIu32"\n", ctx->num_categories);
+	printf("  num_tries=%"PRIu32"\n", ctx->num_tries);
+}
+
+/*
+ * Dump all ACL contexts to stdout.
+ */
+void
+rte_acl_list_dump(void)
+{
+	struct rte_acl_ctx *ctx;
+	struct rte_acl_list *acl_list;
+
+	/* check that we have an initialised tail queue */
+	if ((acl_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_ACL,
+			rte_acl_list)) == NULL) {
+		rte_errno = E_RTE_NO_TAILQ;
+		return;
+	}
+
+	rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_FOREACH(ctx, acl_list, next) {
+		rte_acl_dump(ctx);
+	}
+	rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
+}
+
+/*
+ * Support for legacy ipv4vlan rules.
+ */
+
+RTE_ACL_RULE_DEF(acl_ipv4vlan_rule, RTE_ACL_IPV4VLAN_NUM_FIELDS);
+
+static int
+acl_ipv4vlan_check_rule(const struct rte_acl_ipv4vlan_rule *rule)
+{
+	if (rule->src_port_low > rule->src_port_high ||
+			rule->dst_port_low > rule->dst_port_high ||
+			rule->src_mask_len > BIT_SIZEOF(rule->src_addr) ||
+			rule->dst_mask_len > BIT_SIZEOF(rule->dst_addr))
+		return (-EINVAL);
+
+	return (acl_check_rule(&rule->data));
+}
+
+static void
+acl_ipv4vlan_convert_rule(const struct rte_acl_ipv4vlan_rule *ri,
+	struct acl_ipv4vlan_rule *ro)
+{
+	ro->data = ri->data;
+
+	ro->field[RTE_ACL_IPV4VLAN_PROTO_FIELD].value.u8 = ri->proto;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN1_FIELD].value.u16 = ri->vlan;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN2_FIELD].value.u16 = ri->domain;
+	ro->field[RTE_ACL_IPV4VLAN_SRC_FIELD].value.u32 = ri->src_addr;
+	ro->field[RTE_ACL_IPV4VLAN_DST_FIELD].value.u32 = ri->dst_addr;
+	ro->field[RTE_ACL_IPV4VLAN_SRCP_FIELD].value.u16 = ri->src_port_low;
+	ro->field[RTE_ACL_IPV4VLAN_DSTP_FIELD].value.u16 = ri->dst_port_low;
+
+	ro->field[RTE_ACL_IPV4VLAN_PROTO_FIELD].mask_range.u8 = ri->proto_mask;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN1_FIELD].mask_range.u16 = ri->vlan_mask;
+	ro->field[RTE_ACL_IPV4VLAN_VLAN2_FIELD].mask_range.u16 =
+		ri->domain_mask;
+	ro->field[RTE_ACL_IPV4VLAN_SRC_FIELD].mask_range.u32 =
+		ri->src_mask_len;
+	ro->field[RTE_ACL_IPV4VLAN_DST_FIELD].mask_range.u32 = ri->dst_mask_len;
+	ro->field[RTE_ACL_IPV4VLAN_SRCP_FIELD].mask_range.u16 =
+		ri->src_port_high;
+	ro->field[RTE_ACL_IPV4VLAN_DSTP_FIELD].mask_range.u16 =
+		ri->dst_port_high;
+}
+
+int
+rte_acl_ipv4vlan_add_rules(struct rte_acl_ctx *ctx,
+	const struct rte_acl_ipv4vlan_rule *rules,
+	uint32_t num)
+{
+	int32_t rc;
+	uint32_t i;
+	struct acl_ipv4vlan_rule rv;
+
+	if (ctx == NULL || rules == NULL || ctx->rule_sz != sizeof (rv))
+		return (-EINVAL);
+
+	/* check input rules. */
+	for (i = 0; i != num; i++) {
+		if ((rc = acl_ipv4vlan_check_rule(rules + i)) != 0) {
+			RTE_LOG(ERR, ACL, "%s(%s): rule #%u is invalid\n",
+				__func__, ctx->name, i + 1);
+			return (rc);
+		}
+	}
+
+	if (num + ctx->num_rules > ctx->max_rules)
+		return (-ENOMEM);
+
+	/* perform conversion to the internal format and add to the context. */
+	for (i = 0, rc = 0; i != num && rc == 0; i++) {
+		acl_ipv4vlan_convert_rule(rules + i, &rv);
+		rc = acl_add_rules(ctx, &rv, 1);
+	}
+
+	return (rc);
+}
+
+static void
+acl_ipv4vlan_config(struct rte_acl_config *cfg,
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM],
+	uint32_t num_categories)
+{
+	static const struct rte_acl_field_def
+		ipv4_defs[RTE_ACL_IPV4VLAN_NUM_FIELDS] = {
+		{
+			.type = RTE_ACL_FIELD_TYPE_BITMASK,
+			.size = sizeof (uint8_t),
+			.field_index = RTE_ACL_IPV4VLAN_PROTO_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_PROTO,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_BITMASK,
+			.size = sizeof (uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_VLAN1_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_VLAN,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_BITMASK,
+			.size = sizeof (uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_VLAN2_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_VLAN,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_MASK,
+			.size = sizeof (uint32_t),
+			.field_index = RTE_ACL_IPV4VLAN_SRC_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_SRC,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_MASK,
+			.size = sizeof (uint32_t),
+			.field_index = RTE_ACL_IPV4VLAN_DST_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_DST,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_RANGE,
+			.size = sizeof (uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_SRCP_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		},
+		{
+			.type = RTE_ACL_FIELD_TYPE_RANGE,
+			.size = sizeof (uint16_t),
+			.field_index = RTE_ACL_IPV4VLAN_DSTP_FIELD,
+			.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		},
+	};
+
+	memcpy(&cfg->defs, ipv4_defs, sizeof(ipv4_defs));
+	cfg->num_fields = RTE_DIM(ipv4_defs);
+
+	cfg->defs[RTE_ACL_IPV4VLAN_PROTO_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_PROTO];
+	cfg->defs[RTE_ACL_IPV4VLAN_VLAN1_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_VLAN];
+	cfg->defs[RTE_ACL_IPV4VLAN_VLAN2_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_VLAN] +
+		cfg->defs[RTE_ACL_IPV4VLAN_VLAN1_FIELD].size;
+	cfg->defs[RTE_ACL_IPV4VLAN_SRC_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_SRC];
+	cfg->defs[RTE_ACL_IPV4VLAN_DST_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_DST];
+	cfg->defs[RTE_ACL_IPV4VLAN_SRCP_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_PORTS];
+	cfg->defs[RTE_ACL_IPV4VLAN_DSTP_FIELD].offset =
+		layout[RTE_ACL_IPV4VLAN_PORTS] +
+		cfg->defs[RTE_ACL_IPV4VLAN_SRCP_FIELD].size;
+
+	cfg->num_categories = num_categories;
+}
+
+int
+rte_acl_ipv4vlan_build(struct rte_acl_ctx *ctx,
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM],
+	uint32_t num_categories)
+{
+	struct rte_acl_config cfg;
+
+	if (ctx == NULL || layout == NULL)
+		return (-EINVAL);
+
+	acl_ipv4vlan_config(&cfg, layout, num_categories);
+	return (rte_acl_build(ctx, &cfg));
+}
diff --git a/lib/librte_acl/rte_acl.h b/lib/librte_acl/rte_acl.h
new file mode 100644
index 0000000..afc0f69
--- /dev/null
+++ b/lib/librte_acl/rte_acl.h
@@ -0,0 +1,453 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_H_
+#define _RTE_ACL_H_
+
+/**
+ * @file
+ *
+ * RTE Classifier.
+ */
+
+#include <rte_acl_osdep.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define	RTE_ACL_MAX_CATEGORIES	16
+
+#define	RTE_ACL_RESULTS_MULTIPLIER	(XMM_SIZE / sizeof(uint32_t))
+
+#define RTE_ACL_MAX_LEVELS 64
+#define RTE_ACL_MAX_FIELDS 64
+
+union rte_acl_field_types {
+	uint8_t  u8;
+	uint16_t u16;
+	uint32_t u32;
+	uint64_t u64;
+};
+
+enum {
+	RTE_ACL_FIELD_TYPE_MASK = 0,
+	RTE_ACL_FIELD_TYPE_RANGE,
+	RTE_ACL_FIELD_TYPE_BITMASK
+};
+
+/**
+ * ACL Field definition.
+ * Each field in the ACL rule has an associated definition.
+ * It defines the type of field, its size, its offset in the input buffer,
+ * the field index, and the input index.
+ * For performance reasons, the inner loop of the search function is unrolled
+ * to process four input bytes at a time. This requires the input to be grouped
+ * into sets of 4 consecutive bytes. The loop processes the first input byte as
+ * part of the setup and then subsequent bytes must be in groups of 4
+ * consecutive bytes.
+ */
+struct rte_acl_field_def {
+	uint8_t  type;        /**< type - RTE_ACL_FIELD_TYPE_*. */
+	uint8_t	 size;        /**< size of field: 1, 2, 4, or 8 bytes. */
+	uint8_t	 field_index; /**< index of field inside the rule. */
+	uint8_t  input_index; /**< 0-N input index. */
+	uint32_t offset;      /**< offset to start of field. */
+};
+
+/**
+ * ACL build configuration.
+ * Defines the fields of an ACL trie and the number of categories to build with.
+ */
+struct rte_acl_config {
+	uint32_t num_categories; /**< Number of categories to build with. */
+	uint32_t num_fields;     /**< Number of field definitions. */
+	struct rte_acl_field_def defs[RTE_ACL_MAX_FIELDS];
+	/**< array of field definitions. */
+};
+
+/**
+ * Defines the value of a field for a rule.
+ */
+struct rte_acl_field {
+	union rte_acl_field_types value;
+	/**< a 1, 2, 4, or 8 byte value of the field. */
+	union rte_acl_field_types mask_range;
+	/**<
+	 * depending on field type:
+	 * mask -> 1.2.3.4/32 value=0x1020304, mask_range=32,
+	 * range -> 0 : 65535 value=0, mask_range=65535,
+	 * bitmask -> 0x06/0xff value=6, mask_range=0xff.
+	 */
+};
+
+enum {
+	RTE_ACL_TYPE_SHIFT = 29,
+	RTE_ACL_MAX_INDEX = LEN2MASK(RTE_ACL_TYPE_SHIFT),
+	RTE_ACL_MAX_PRIORITY = RTE_ACL_MAX_INDEX,
+	RTE_ACL_MIN_PRIORITY = 0,
+};
+
+#define	RTE_ACL_INVALID_USERDATA	0
+
+/**
+ * Miscellaneous data for ACL rule.
+ */
+struct rte_acl_rule_data {
+	uint32_t category_mask; /**< Mask of categories for that rule. */
+	int32_t  priority;      /**< Priority for that rule. */
+	uint32_t userdata;      /**< User data associated with the rule. */
+};
+
+/**
+ * Defines single ACL rule.
+ * data - miscellaneous data for the rule.
+ * field[] - value and mask or range for each field.
+ */
+#define	RTE_ACL_RULE_DEF(name, fld_num)	struct name {\
+	struct rte_acl_rule_data data;               \
+	struct rte_acl_field field[fld_num];         \
+}
+
+RTE_ACL_RULE_DEF(rte_acl_rule, 0);
+
+#define	RTE_ACL_RULE_SZ(fld_num)	\
+	(sizeof(struct rte_acl_rule) + sizeof(struct rte_acl_field) * (fld_num))
+
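The rule-layout arithmetic above can be reproduced in a small self-contained sketch. The types below are local stand-ins mirroring the header; the exact members of `union rte_acl_field_types` are an assumption here, inferred from the "1,2,4, or 8 byte value" comment:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Local stand-ins mirroring the header above (sketch only). */
union acl_field_types {
	uint8_t  u8;
	uint16_t u16;
	uint32_t u32;
	uint64_t u64;
};

struct acl_rule_data {
	uint32_t category_mask;
	int32_t  priority;
	uint32_t userdata;
};

struct acl_field {
	union acl_field_types value;
	union acl_field_types mask_range;
};

/* Same pattern as RTE_ACL_RULE_DEF/RTE_ACL_RULE_SZ: a fixed rule header
 * followed by a variable number of per-field entries. */
#define ACL_RULE_DEF(name, fld_num)	struct name {	\
	struct acl_rule_data data;			\
	struct acl_field field[fld_num];		\
}

ACL_RULE_DEF(acl_rule, 0);

#define ACL_RULE_SZ(fld_num)	\
	(sizeof(struct acl_rule) + sizeof(struct acl_field) * (fld_num))

/* Helper so an N-field rule size can be computed at run time. */
static inline size_t
acl_rule_size(uint32_t fld_num)
{
	return ACL_RULE_SZ(fld_num);
}
```

This is why `rule_size` is a creation-time parameter of the context: every rule in a context shares one fixed field count, so rules can be stored in a flat array.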
+
+/** Max number of characters in name.*/
+#define	RTE_ACL_NAMESIZE		32
+
+/**
+ * Parameters used when creating the ACL context.
+ */
+struct rte_acl_param {
+	const char *name;         /**< Name of the ACL context. */
+	int         socket_id;    /**< Socket ID to allocate memory for. */
+	uint32_t    rule_size;    /**< Size of each rule. */
+	uint32_t    max_rule_num; /**< Maximum number of rules. */
+};
+
+
+/**
+ * Create a new ACL context.
+ *
+ * @param param
+ *   Parameters used to create and initialise the ACL context.
+ * @return
+ *   Pointer to ACL context structure that is used in future ACL
+ *   operations, or NULL on error, with error code set in rte_errno.
+ *   Possible rte_errno errors include:
+ *   - E_RTE_NO_TAILQ - no tailq list could be obtained for the ACL context list
+ *   - EINVAL - invalid parameter passed to function
+ */
+struct rte_acl_ctx *
+rte_acl_create(const struct rte_acl_param *param);
+
+/**
+ * Find an existing ACL context object and return a pointer to it.
+ *
+ * @param name
+ *   Name of the ACL context as passed to rte_acl_create()
+ * @return
+ *   Pointer to ACL context or NULL if object not found
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - ENOENT - value not available for return
+ */
+struct rte_acl_ctx *
+rte_acl_find_existing(const char *name);
+
+/**
+ * De-allocate all memory used by ACL context.
+ *
+ * @param ctx
+ *   ACL context to free
+ */
+void
+rte_acl_free(struct rte_acl_ctx *ctx);
+
+/**
+ * Add rules to an existing ACL context.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to add patterns to.
+ * @param rules
+ *   Array of rules to add to the ACL context.
+ *   Note that all fields in rte_acl_rule structures are expected
+ *   to be in host byte order.
+ *   Each rule is expected to be in the same format and must not exceed
+ *   the size specified at ACL context creation time.
+ * @param num
+ *   Number of elements in the input array of rules.
+ * @return
+ *   - -ENOMEM if there is no space in the ACL context for these rules.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_add_rules(struct rte_acl_ctx *ctx, const struct rte_acl_rule *rules,
+	uint32_t num);
+
+/**
+ * Delete all rules from the ACL context.
+ * This function is not multi-thread safe.
+ * Note that internal run-time structures are not affected.
+ *
+ * @param ctx
+ *   ACL context to delete rules from.
+ */
+void
+rte_acl_reset_rules(struct rte_acl_ctx *ctx);
+
+/**
+ * Analyze the set of rules and build the required run-time structures.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to build.
+ * @param cfg
+ *   Pointer to struct rte_acl_config - defines build parameters.
+ * @return
+ *   - -ENOMEM if couldn't allocate enough memory.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Negative error code if operation failed.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_build(struct rte_acl_ctx *ctx, const struct rte_acl_config *cfg);
+
+/**
+ * Delete all rules from the ACL context and
+ * destroy all internal run-time structures.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to reset.
+ */
+void
+rte_acl_reset(struct rte_acl_ctx *ctx);
+
+/**
+ * Search for a matching ACL rule for each input data buffer.
+ * Each input data buffer can have up to *categories* matches.
+ * This implies that the results array should be big enough to hold
+ * (categories * num) elements.
+ * The categories parameter should be either 1 or a multiple of
+ * RTE_ACL_RESULTS_MULTIPLIER and can't be bigger than RTE_ACL_MAX_CATEGORIES.
+ * If more than one rule is applicable for a given input buffer and
+ * category, then the rule with the highest priority is returned as the match.
+ * Note that it is the caller's responsibility to ensure that the input
+ * parameters are valid and point to correct memory locations.
+ *
+ * @param ctx
+ *   ACL context to search with.
+ * @param data
+ *   Array of pointers to input data buffers to perform search.
+ *   Note that all fields in the input data buffers are expected to be in
+ *   network byte order (MSB first).
+ * @param results
+ *   Array of search results, *categories* results per each input data buffer.
+ * @param num
+ *   Number of elements in the input data buffers array.
+ * @param categories
+ *   Number of maximum possible matches for each input buffer, one possible
+ *   match per category.
+ * @return
+ *   zero on successful completion.
+ *   -EINVAL for incorrect arguments.
+ */
+int
+rte_acl_classify(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories);
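Putting the API together, a typical call sequence looks like the following C-style sketch. It is not compilable on its own (it depends on the DPDK runtime, and `rules`, `num_rules`, `cfg`, `data` and `BURST_SZ` are elided placeholders), but it shows the create → add → build → classify → free flow documented above:

```c
/* Sketch only - depends on the DPDK runtime; rule contents elided. */
struct rte_acl_param param = {
	.name = "acl-demo",             /* hypothetical context name */
	.socket_id = SOCKET_ID_ANY,
	.rule_size = RTE_ACL_RULE_SZ(5),
	.max_rule_num = 1024,
};

struct rte_acl_ctx *ctx = rte_acl_create(&param);
if (ctx == NULL)
	rte_exit(EXIT_FAILURE, "cannot create ACL context\n");

/* rules[] filled in host byte order, cfg describing the 5 fields */
if (rte_acl_add_rules(ctx, rules, num_rules) != 0 ||
		rte_acl_build(ctx, &cfg) != 0)
	rte_exit(EXIT_FAILURE, "cannot build ACL context\n");

/* data[] points at packet fields, already in network byte order */
uint32_t results[BURST_SZ];	/* categories == 1 */
rte_acl_classify(ctx, data, results, BURST_SZ, 1);

rte_acl_free(ctx);
```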
+
+/**
+ * Perform scalar search for a matching ACL rule for each input data buffer.
+ * Note that while the search itself avoids explicit use of SSE/AVX
+ * intrinsics, the code comparing match results/priorities still might use
+ * vector intrinsics (for categories > 1).
+ * Each input data buffer can have up to *categories* matches.
+ * This implies that the results array should be big enough to hold
+ * (categories * num) elements.
+ * The categories parameter should be either 1 or a multiple of
+ * RTE_ACL_RESULTS_MULTIPLIER and can't be bigger than RTE_ACL_MAX_CATEGORIES.
+ * If more than one rule is applicable for a given input buffer and
+ * category, then the rule with the highest priority is returned as the match.
+ * Note that it is the caller's responsibility to ensure that the input
+ * parameters are valid and point to correct memory locations.
+ *
+ * @param ctx
+ *   ACL context to search with.
+ * @param data
+ *   Array of pointers to input data buffers to perform search.
+ *   Note that all fields in the input data buffers are expected to be in
+ *   network byte order (MSB first).
+ * @param results
+ *   Array of search results, *categories* results per each input data buffer.
+ * @param num
+ *   Number of elements in the input data buffers array.
+ * @param categories
+ *   Number of maximum possible matches for each input buffer, one possible
+ *   match per category.
+ * @return
+ *   zero on successful completion.
+ *   -EINVAL for incorrect arguments.
+ */
+int
+rte_acl_classify_scalar(const struct rte_acl_ctx *ctx, const uint8_t **data,
+	uint32_t *results, uint32_t num, uint32_t categories);
+
+/**
+ * Dump an ACL context structure to the console.
+ *
+ * @param ctx
+ *   ACL context to dump.
+ */
+void
+rte_acl_dump(const struct rte_acl_ctx *ctx);
+
+/**
+ * Dump all ACL context structures to the console.
+ */
+void
+rte_acl_list_dump(void);
+
+/**
+ * Legacy support for 7-tuple IPv4 and VLAN rule.
+ * This structure and corresponding API is deprecated.
+ */
+struct rte_acl_ipv4vlan_rule {
+	struct rte_acl_rule_data data; /**< Miscellaneous data for the rule. */
+	uint8_t proto;                 /**< IPv4 protocol ID. */
+	uint8_t proto_mask;            /**< IPv4 protocol ID mask. */
+	uint16_t vlan;                 /**< VLAN ID. */
+	uint16_t vlan_mask;            /**< VLAN ID mask. */
+	uint16_t domain;               /**< VLAN domain. */
+	uint16_t domain_mask;          /**< VLAN domain mask. */
+	uint32_t src_addr;             /**< IPv4 source address. */
+	uint32_t src_mask_len;         /**< IPv4 source address mask. */
+	uint32_t dst_addr;             /**< IPv4 destination address. */
+	uint32_t dst_mask_len;         /**< IPv4 destination address mask. */
+	uint16_t src_port_low;         /**< L4 source port low. */
+	uint16_t src_port_high;        /**< L4 source port high. */
+	uint16_t dst_port_low;         /**< L4 destination port low. */
+	uint16_t dst_port_high;        /**< L4 destination port high. */
+};
+
+/**
+ * Specifies fields layout inside rte_acl_rule for rte_acl_ipv4vlan_rule.
+ */
+enum {
+	RTE_ACL_IPV4VLAN_PROTO_FIELD,
+	RTE_ACL_IPV4VLAN_VLAN1_FIELD,
+	RTE_ACL_IPV4VLAN_VLAN2_FIELD,
+	RTE_ACL_IPV4VLAN_SRC_FIELD,
+	RTE_ACL_IPV4VLAN_DST_FIELD,
+	RTE_ACL_IPV4VLAN_SRCP_FIELD,
+	RTE_ACL_IPV4VLAN_DSTP_FIELD,
+	RTE_ACL_IPV4VLAN_NUM_FIELDS
+};
+
+/**
+ * Macro to define rule size for rte_acl_ipv4vlan_rule.
+ */
+#define	RTE_ACL_IPV4VLAN_RULE_SZ	\
+	RTE_ACL_RULE_SZ(RTE_ACL_IPV4VLAN_NUM_FIELDS)
+
+/*
+ * This effectively defines the order of IPV4VLAN classifications:
+ *  - PROTO
+ *  - VLAN (TAG and DOMAIN)
+ *  - SRC IP ADDRESS
+ *  - DST IP ADDRESS
+ *  - PORTS (SRC and DST)
+ */
+enum {
+	RTE_ACL_IPV4VLAN_PROTO,
+	RTE_ACL_IPV4VLAN_VLAN,
+	RTE_ACL_IPV4VLAN_SRC,
+	RTE_ACL_IPV4VLAN_DST,
+	RTE_ACL_IPV4VLAN_PORTS,
+	RTE_ACL_IPV4VLAN_NUM
+};
+
+/**
+ * Add ipv4vlan rules to an existing ACL context.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to add patterns to.
+ * @param rules
+ *   Array of rules to add to the ACL context.
+ *   Note that all fields in rte_acl_ipv4vlan_rule structures are expected
+ *   to be in host byte order.
+ * @param num
+ *   Number of elements in the input array of rules.
+ * @return
+ *   - -ENOMEM if there is no space in the ACL context for these rules.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_ipv4vlan_add_rules(struct rte_acl_ctx *ctx,
+	const struct rte_acl_ipv4vlan_rule *rules,
+	uint32_t num);
+
+/**
+ * Analyze the set of ipv4vlan rules and build the required internal
+ * run-time structures.
+ * This function is not multi-thread safe.
+ *
+ * @param ctx
+ *   ACL context to build.
+ * @param layout
+ *   Layout of input data to search through.
+ * @param num_categories
+ *   Maximum number of categories to use in that build.
+ * @return
+ *   - -ENOMEM if couldn't allocate enough memory.
+ *   - -EINVAL if the parameters are invalid.
+ *   - Negative error code if operation failed.
+ *   - Zero if operation completed successfully.
+ */
+int
+rte_acl_ipv4vlan_build(struct rte_acl_ctx *ctx,
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM],
+	uint32_t num_categories);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACL_H_ */
diff --git a/lib/librte_acl/rte_acl_osdep.h b/lib/librte_acl/rte_acl_osdep.h
new file mode 100644
index 0000000..046b22d
--- /dev/null
+++ b/lib/librte_acl/rte_acl_osdep.h
@@ -0,0 +1,92 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_OSDEP_H_
+#define _RTE_ACL_OSDEP_H_
+
+/**
+ * @file
+ *
+ * RTE ACL DPDK/OS dependent file.
+ */
+
+#include <stdint.h>
+#include <stddef.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <ctype.h>
+#include <string.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdarg.h>
+#include <stdlib.h>
+#include <sys/queue.h>
+
+/*
+ * Common defines.
+ */
+
+#define	LEN2MASK(ln)	((uint32_t)(((uint64_t)1 << (ln)) - 1))
+
+#define DIM(x) RTE_DIM(x)
+
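As a quick illustration of the macro just defined (same expression, local name): the 64-bit intermediate shift is what keeps `LEN2MASK(32)` well defined, since a 32-bit `1 << 32` would be undefined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with the 'ln' least significant bits set. The shift is
 * done in 64 bits so that ln == 32 yields 0xffffffff instead of UB. */
#define LEN2MASK(ln)	((uint32_t)(((uint64_t)1 << (ln)) - 1))

static inline uint32_t
len2mask(unsigned int ln)
{
	return LEN2MASK(ln);
}
```

Note that `RTE_ACL_MAX_INDEX` in rte_acl.h is exactly `LEN2MASK(RTE_ACL_TYPE_SHIFT)`, i.e. `LEN2MASK(29)`.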
+/*
+ * To build ACL standalone.
+ */
+#ifdef RTE_LIBRTE_ACL_STANDALONE
+#include <rte_acl_osdep_alone.h>
+#else
+
+#include <rte_common.h>
+#include <rte_common_vect.h>
+#include <rte_memory.h>
+#include <rte_log.h>
+#include <rte_memcpy.h>
+#include <rte_prefetch.h>
+#include <rte_byteorder.h>
+#include <rte_branch_prediction.h>
+#include <rte_memzone.h>
+#include <rte_malloc.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
+#include <rte_per_lcore.h>
+#include <rte_errno.h>
+#include <rte_string_fns.h>
+#include <rte_cpuflags.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+
+#endif /* RTE_LIBRTE_ACL_STANDALONE */
+
+#endif /* _RTE_ACL_OSDEP_H_ */
diff --git a/lib/librte_acl/rte_acl_osdep_alone.h b/lib/librte_acl/rte_acl_osdep_alone.h
new file mode 100644
index 0000000..16c0cea
--- /dev/null
+++ b/lib/librte_acl/rte_acl_osdep_alone.h
@@ -0,0 +1,277 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ACL_OSDEP_ALONE_H_
+#define _RTE_ACL_OSDEP_ALONE_H_
+
+/**
+ * @file
+ *
+ * RTE ACL OS dependent file.
+ * An example of how to build/use the ACL library standalone
+ * (without the rest of DPDK).
+ * Don't include this file on its own; use <rte_acl_osdep.h>.
+ */
+
+#if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
+
+#ifdef __SSE__
+#include <xmmintrin.h>
+#endif
+
+#ifdef __SSE2__
+#include <emmintrin.h>
+#endif
+
+#if defined (__SSE4_2__) || defined (__SSE4_1__)
+#include <smmintrin.h>
+#endif
+
+#else
+
+#include <x86intrin.h>
+
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define	DUMMY_MACRO	do {} while (0)
+
+/*
+ * rte_common related.
+ */
+#define	__rte_unused	__attribute__((__unused__))
+
+#define RTE_PTR_ADD(ptr, x)	((typeof(ptr))((uintptr_t)(ptr) + (x)))
+
+#define	RTE_PTR_ALIGN_FLOOR(ptr, align) \
+	(typeof(ptr))((uintptr_t)(ptr) & ~((uintptr_t)(align) - 1))
+
+#define	RTE_PTR_ALIGN_CEIL(ptr, align) \
+	RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(ptr, (align) - 1), align)
+
+#define	RTE_PTR_ALIGN(ptr, align)	RTE_PTR_ALIGN_CEIL(ptr, align)
+
+#define	RTE_ALIGN_FLOOR(val, align) \
+	(typeof(val))((val) & (~((typeof(val))((align) - 1))))
+
+#define	RTE_ALIGN_CEIL(val, align) \
+	RTE_ALIGN_FLOOR(((val) + ((typeof(val))(align) - 1)), align)
+
+#define	RTE_ALIGN(ptr, align)	RTE_ALIGN_CEIL(ptr, align)
+
+#define	RTE_MIN(a, b)	({ \
+		typeof (a) _a = (a); \
+		typeof (b) _b = (b); \
+		_a < _b ? _a : _b;   \
+	})
+
+#define	RTE_DIM(a)		(sizeof (a) / sizeof ((a)[0]))
+
+/**
+ * Searches the input parameter for the least significant set bit
+ * (starting from zero).
+ * If a least significant 1 bit is found, its bit index is returned.
+ * If the content of the input parameter is zero, then the content of the
+ * return value is undefined.
+ * @param v
+ *     input parameter, should not be zero.
+ * @return
+ *     least significant set bit in the input parameter.
+ */
+static inline uint32_t
+rte_bsf32(uint32_t v)
+{
+	asm("bsf %1,%0"
+		: "=r" (v)
+		: "rm" (v));
+	return (v);
+}
+
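A portable equivalent of the inline-asm version above, using the GCC builtin instead of `bsf` (a sketch; behaviour is likewise undefined for a zero input):

```c
#include <assert.h>
#include <stdint.h>

/* Index of the least significant set bit; undefined for v == 0. */
static inline uint32_t
bsf32(uint32_t v)
{
	return (uint32_t)__builtin_ctz(v);
}
```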
+/*
+ * rte_common_vect related.
+ */
+typedef __m128i xmm_t;
+
+#define	XMM_SIZE	(sizeof (xmm_t))
+#define	XMM_MASK	(XMM_SIZE - 1)
+
+typedef union rte_mmsse {
+    xmm_t    m;
+    uint8_t  u8[XMM_SIZE / sizeof (uint8_t)];
+    uint16_t u16[XMM_SIZE / sizeof (uint16_t)];
+    uint32_t u32[XMM_SIZE / sizeof (uint32_t)];
+    uint64_t u64[XMM_SIZE / sizeof (uint64_t)];
+    double   pd[XMM_SIZE / sizeof (double)];
+} rte_xmm_t;
+
+/*
+ * rte_cycles related.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t lo_32;
+			uint32_t hi_32;
+		};
+	} tsc;
+
+	asm volatile("rdtsc" :
+		"=a" (tsc.lo_32),
+		"=d" (tsc.hi_32));
+	return tsc.tsc_64;
+}
+
+/*
+ * rte_lcore related.
+ */
+#define rte_lcore_id()	(0)
+
+/*
+ * rte_errno related.
+ */
+#define	rte_errno	errno
+#define	E_RTE_NO_TAILQ	(-1)
+
+/*
+ * rte_rwlock related.
+ */
+#define	rte_rwlock_read_lock(x)		DUMMY_MACRO
+#define	rte_rwlock_read_unlock(x)	DUMMY_MACRO
+#define	rte_rwlock_write_lock(x)	DUMMY_MACRO
+#define	rte_rwlock_write_unlock(x)	DUMMY_MACRO
+
+/*
+ * rte_memory related.
+ */
+#define	SOCKET_ID_ANY	-1                  /**< Any NUMA socket. */
+#define	CACHE_LINE_SIZE	64                  /**< Cache line size. */
+#define	CACHE_LINE_MASK	(CACHE_LINE_SIZE-1) /**< Cache line mask. */
+
+/**
+ * Force alignment to cache line.
+ */
+#define	__rte_cache_aligned	__attribute__((__aligned__(CACHE_LINE_SIZE)))
+
+
+/*
+ * rte_byteorder related.
+ */
+#define	rte_le_to_cpu_16(x)	(x)
+#define	rte_le_to_cpu_32(x)	(x)
+
+#define rte_cpu_to_be_16(x)	\
+	(((x) & UINT8_MAX) << CHAR_BIT | ((x) >> CHAR_BIT & UINT8_MAX))
+#define rte_cpu_to_be_32(x)	__builtin_bswap32(x)
+
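The 16-bit conversion above is an unconditional byte swap (correct for the little-endian x86 targets this standalone header assumes). A quick self-contained check of the same expression, under a local name:

```c
#include <assert.h>
#include <stdint.h>
#include <limits.h>

/* Swap the two bytes of a 16-bit value, as rte_cpu_to_be_16() above. */
#define CPU_TO_BE_16(x)	\
	(((x) & UINT8_MAX) << CHAR_BIT | ((x) >> CHAR_BIT & UINT8_MAX))

static inline uint16_t
cpu_to_be_16(uint16_t v)
{
	return (uint16_t)CPU_TO_BE_16(v);
}
```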
+/*
+ * rte_branch_prediction related.
+ */
+#ifndef	likely
+#define	likely(x)	__builtin_expect((x),1)
+#endif	/* likely */
+
+#ifndef	unlikely
+#define	unlikely(x)	__builtin_expect((x),0)
+#endif	/* unlikely */
+
+
+/*
+ * rte_tailq related.
+ */
+static inline void *
+rte_dummy_tailq(void)
+{
+	static __thread TAILQ_HEAD(rte_dummy_head, rte_dummy) dummy_head;
+	TAILQ_INIT(&dummy_head);
+	return (&dummy_head);
+}
+
+#define	RTE_TAILQ_LOOKUP_BY_IDX(idx, struct_name)	rte_dummy_tailq()
+
+#define RTE_EAL_TAILQ_REMOVE(idx, type, elm)	DUMMY_MACRO
+
+/*
+ * rte_string related
+ */
+#define	rte_snprintf(str, len, frmt, args...)	snprintf(str, len, frmt, ##args)
+
+/*
+ * rte_log related
+ */
+#define RTE_LOG(l, t, fmt, args...)	printf(fmt, ##args)
+
+/*
+ * rte_malloc related
+ */
+#define	rte_free(x)	free(x)
+
+static inline void *
+rte_zmalloc_socket(__rte_unused const char *type, size_t size, unsigned align,
+	__rte_unused int socket)
+{
+	void *ptr;
+	int rc;
+
+	if ((rc = posix_memalign(&ptr, align, size)) != 0) {
+		rte_errno = rc;
+		return (NULL);
+	}
+
+	memset(ptr, 0, size);
+	return (ptr);
+}
+
+/*
+ * rte_debug related
+ */
+#define	rte_panic(fmt, args...)	do {         \
+	RTE_LOG(CRIT, EAL, fmt, ##args);     \
+	abort();                             \
+} while (0)
+
+#define	rte_exit(err, fmt, args...)	do { \
+	RTE_LOG(CRIT, EAL, fmt, ##args);     \
+	exit(err);                           \
+} while (0)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACL_OSDEP_ALONE_H_ */
diff --git a/lib/librte_acl/tb_mem.c b/lib/librte_acl/tb_mem.c
new file mode 100644
index 0000000..817d0c8
--- /dev/null
+++ b/lib/librte_acl/tb_mem.c
@@ -0,0 +1,102 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "tb_mem.h"
+
+/*
+ *  Memory management routines for temporary memory.
+ *  That memory is used only during the build phase and is released once
+ *  the build is finished.
+ */
+
+static struct tb_mem_block *
+tb_pool(struct tb_mem_pool *pool, size_t sz)
+{
+	struct tb_mem_block *block;
+	uint8_t *ptr;
+	size_t size;
+
+	size = sz + pool->alignment - 1;
+	if ((block = calloc(1, size + sizeof(*pool->block))) == NULL) {
+		RTE_LOG(ERR, MALLOC, "%s(%zu) failed, currently allocated "
+			"by pool: %zu bytes\n", __func__, sz, pool->alloc);
+		return (NULL);
+	}
+
+	block->pool = pool;
+
+	block->next = pool->block;
+	pool->block = block;
+
+	pool->alloc += size;
+
+	ptr = (uint8_t *)(block + 1);
+	block->mem = RTE_PTR_ALIGN_CEIL(ptr, pool->alignment);
+	block->size = size - (block->mem - ptr);
+
+	return (block);
+}
+
+void *
+tb_alloc(struct tb_mem_pool *pool, size_t size)
+{
+	struct tb_mem_block *block;
+	void *ptr;
+	size_t new_sz;
+
+	size = RTE_ALIGN_CEIL(size, pool->alignment);
+
+	block = pool->block;
+	if (block == NULL || block->size < size) {
+		new_sz = (size > pool->min_alloc) ? size : pool->min_alloc;
+		if ((block = tb_pool(pool, new_sz)) == NULL)
+			return (NULL);
+	}
+	ptr = block->mem;
+	block->size -= size;
+	block->mem += size;
+	return (ptr);
+}
+
+void
+tb_free_pool(struct tb_mem_pool *pool)
+{
+	struct tb_mem_block *next, *block;
+
+	for (block = pool->block; block != NULL; block = next) {
+		next = block->next;
+		free(block);
+	}
+	pool->block = NULL;
+	pool->alloc = 0;
+}
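The allocator above amortizes per-allocation overhead and makes teardown proportional to the number of blocks rather than the number of allocations. A condensed, self-contained sketch of the same scheme (local names; the RTE logging and alignment macros are replaced with plain C):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Condensed stand-in for tb_mem: carve aligned slices out of large
 * blocks, then release everything in one pass. */
struct blk {
	struct blk *next;
	size_t      size;   /* bytes still available in this block */
	uint8_t    *mem;    /* next free, aligned position */
};

struct pool {
	struct blk *block;      /* head of the block list */
	size_t      alignment;  /* power of two */
	size_t      min_alloc;  /* minimum size of a fresh block */
};

static size_t
align_up(size_t v, size_t a)
{
	return (v + a - 1) & ~(a - 1);
}

static void *
pool_alloc(struct pool *p, size_t size)
{
	struct blk *b = p->block;
	void *ptr;

	size = align_up(size, p->alignment);
	if (b == NULL || b->size < size) {
		size_t sz = (size > p->min_alloc) ? size : p->min_alloc;

		/* over-allocate by (alignment - 1) to cover the slack */
		b = calloc(1, sizeof(*b) + sz + p->alignment - 1);
		if (b == NULL)
			return NULL;
		b->mem = (uint8_t *)(uintptr_t)
			align_up((uintptr_t)(b + 1), p->alignment);
		b->size = sz;
		b->next = p->block;
		p->block = b;
	}
	ptr = b->mem;
	b->mem += size;
	b->size -= size;
	return ptr;
}

static void
pool_free(struct pool *p)
{
	struct blk *b, *next;

	for (b = p->block; b != NULL; b = next) {
		next = b->next;
		free(b);
	}
	p->block = NULL;
}
```

As in tb_mem, individual allocations are never freed one by one; the whole pool is dropped when the build-phase trie is torn down.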
diff --git a/lib/librte_acl/tb_mem.h b/lib/librte_acl/tb_mem.h
new file mode 100644
index 0000000..a3ed795
--- /dev/null
+++ b/lib/librte_acl/tb_mem.h
@@ -0,0 +1,73 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _TB_MEM_H_
+#define _TB_MEM_H_
+
+/**
+ * @file
+ *
+ * RTE ACL temporary (build phase) memory management.
+ * Contains structures and functions to manage temporary (build-only)
+ * memory. Memory is allocated in large blocks to speed up freeing when
+ * the trie is destroyed (at the end of the build phase).
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_acl_osdep.h>
+
+struct tb_mem_block {
+	struct tb_mem_block *next;
+	struct tb_mem_pool  *pool;
+	size_t               size;
+	uint8_t             *mem;
+};
+
+struct tb_mem_pool {
+	struct tb_mem_block *block;
+	size_t               alignment;
+	size_t               min_alloc;
+	size_t               alloc;
+};
+
+void *tb_alloc(struct tb_mem_pool *pool, size_t size);
+void tb_free_pool(struct tb_mem_pool *pool);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _TB_MEM_H_ */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCHv2 2/5] acl: update UT to reflect latest changes in the librte_acl.
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 1/5] acl: Add ACL library (librte_acl) into DPDK Konstantin Ananyev
@ 2014-05-28 19:26 ` Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 3/5] acl: New test-acl application Konstantin Ananyev
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ananyev @ 2014-05-28 19:26 UTC (permalink / raw)
  To: dev, dev

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/test/test_acl.c |  128 ++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 85 insertions(+), 43 deletions(-)

diff --git a/app/test/test_acl.c b/app/test/test_acl.c
index 790cdf3..c171eac 100644
--- a/app/test/test_acl.c
+++ b/app/test/test_acl.c
@@ -96,47 +96,13 @@ bswap_test_data(struct ipv4_7tuple * data, int len, int to_be)
  * Test scalar and SSE ACL lookup.
  */
 static int
-test_classify(void)
+test_classify_run(struct rte_acl_ctx * acx)
 {
-	struct rte_acl_ctx * acx;
 	int ret, i;
 	uint32_t result, count;
-
 	uint32_t results[RTE_DIM(acl_test_data) * RTE_ACL_MAX_CATEGORIES];
-
 	const uint8_t * data[RTE_DIM(acl_test_data)];
 
-	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {
-			offsetof(struct ipv4_7tuple, proto),
-			offsetof(struct ipv4_7tuple, vlan),
-			offsetof(struct ipv4_7tuple, ip_src),
-			offsetof(struct ipv4_7tuple, ip_dst),
-			offsetof(struct ipv4_7tuple, port_src),
-	};
-
-	acx = rte_acl_create(&acl_param);
-	if (acx == NULL) {
-		printf("Line %i: Error creating ACL context!\n", __LINE__);
-		return -1;
-	}
-
-	/* add rules to the context */
-	ret = rte_acl_ipv4vlan_add_rules(acx, acl_test_rules,
-			RTE_DIM(acl_test_rules));
-	if (ret != 0) {
-		printf("Line %i: Adding rules to ACL context failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
-
-	/* try building the context */
-	ret = rte_acl_ipv4vlan_build(acx, layout, RTE_ACL_MAX_CATEGORIES);
-	if (ret != 0) {
-		printf("Line %i: Building ACL context failed!\n", __LINE__);
-		rte_acl_free(acx);
-		return -1;
-	}
-
 	/* swap all bytes in the data to network order */
 	bswap_test_data(acl_test_data, RTE_DIM(acl_test_data), 1);
 
@@ -213,21 +179,97 @@ test_classify(void)
 		}
 	}
 
-	/* free ACL context */
-	rte_acl_free(acx);
+	ret = 0;
 
+err:
 	/* swap data back to cpu order so that next time tests don't fail */
 	bswap_test_data(acl_test_data, RTE_DIM(acl_test_data), 0);
+	return (ret);
+}
 
-	return 0;
-err:
+static int
+test_classify_buid(struct rte_acl_ctx * acx)
+{
+	int ret;
+	const uint32_t layout[RTE_ACL_IPV4VLAN_NUM] = {
+			offsetof(struct ipv4_7tuple, proto),
+			offsetof(struct ipv4_7tuple, vlan),
+			offsetof(struct ipv4_7tuple, ip_src),
+			offsetof(struct ipv4_7tuple, ip_dst),
+			offsetof(struct ipv4_7tuple, port_src),
+	};
 
-	/* swap data back to cpu order so that next time tests don't fail */
-	bswap_test_data(acl_test_data, RTE_DIM(acl_test_data), 0);
+	/* add rules to the context */
+	ret = rte_acl_ipv4vlan_add_rules(acx, acl_test_rules,
+			RTE_DIM(acl_test_rules));
+	if (ret != 0) {
+		printf("Line %i: Adding rules to ACL context failed!\n",
+			__LINE__);
+		return (ret);
+	}
 
-	rte_acl_free(acx);
+	/* try building the context */
+	ret = rte_acl_ipv4vlan_build(acx, layout, RTE_ACL_MAX_CATEGORIES);
+	if (ret != 0) {
+		printf("Line %i: Building ACL context failed!\n", __LINE__);
+		return (ret);
+	}
 
-	return -1;
+	return (0);
+}
+
+#define	TEST_CLASSIFY_ITER	4
+
+/*
+ * Test scalar and SSE ACL lookup.
+ */
+static int
+test_classify(void)
+{
+	struct rte_acl_ctx * acx;
+	int i, ret;
+
+	acx = rte_acl_create(&acl_param);
+	if (acx == NULL) {
+		printf("Line %i: Error creating ACL context!\n", __LINE__);
+		return -1;
+	}
+
+	ret = 0;
+	for (i = 0; i != TEST_CLASSIFY_ITER; i++) {
+
+		if ((i & 1) == 0)
+			rte_acl_reset(acx);
+		else
+			rte_acl_reset_rules(acx);
+
+		ret = test_classify_buid(acx);
+		if (ret != 0) {
+			printf("Line %i, iter: %d: "
+				"Adding rules to ACL context failed!\n",
+				__LINE__, i);
+			break;
+		}
+
+		ret = test_classify_run(acx);
+		if (ret != 0) {
+			printf("Line %i, iter: %d: %s failed!\n",
+				__LINE__, i, __func__);
+			break;
+		}
+
+		/* reset rules and make sure that classify still works ok. */
+		rte_acl_reset_rules(acx);
+		ret = test_classify_run(acx);
+		if (ret != 0) {
+			printf("Line %i, iter: %d: %s failed!\n",
+				__LINE__, i, __func__);
+			break;
+		}
+	}
+
+	rte_acl_free(acx);
+	return (ret);
 }
 
 /*
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCHv2 3/5] acl: New test-acl application.
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 1/5] acl: Add ACL library (librte_acl) into DPDK Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 2/5] acl: update UT to reflect latest changes in the librte_acl Konstantin Ananyev
@ 2014-05-28 19:26 ` Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 4/5] acl: New sample l3fwd-acl Konstantin Ananyev
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ananyev @ 2014-05-28 19:26 UTC (permalink / raw)
  To: dev, dev

Introduce test-acl:
Usage example and main test application for the ACL library.
Provides IPv4/IPv6 5-tuple classification.
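
As a rough, self-contained sketch of the field-parsing approach the application
uses (one strtoul() call per field with an expected delimiter, in the spirit of
main.c's GET_CB_FIELD macro); get_field() and parse_ipv4_cidr() are
illustrative names, not part of the patch:

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse one numeric field, checking range and the expected delimiter. */
static int
get_field(const char **in, uint32_t *fd, int base, uint32_t lim, char dlm)
{
	unsigned long val;
	char *end;

	errno = 0;
	val = strtoul(*in, &end, base);
	if (errno != 0 || end[0] != dlm || val > lim)
		return -EINVAL;
	*fd = (uint32_t)val;
	*in = end + 1;
	return 0;
}

/* Parse "a.b.c.d/len" into a host-order address and a prefix length,
 * the way the ClassBench rule reader consumes address/mask fields. */
static int
parse_ipv4_cidr(const char *in, uint32_t *addr, uint32_t *mask_len)
{
	uint32_t a, b, c, d, m;

	if (get_field(&in, &a, 0, UINT8_MAX, '.') != 0 ||
	    get_field(&in, &b, 0, UINT8_MAX, '.') != 0 ||
	    get_field(&in, &c, 0, UINT8_MAX, '.') != 0 ||
	    get_field(&in, &d, 0, UINT8_MAX, '/') != 0 ||
	    get_field(&in, &m, 0, sizeof(uint32_t) * CHAR_BIT, '\0') != 0)
		return -EINVAL;

	*addr = (a << 24) | (b << 16) | (c << 8) | d;
	*mask_len = m;
	return 0;
}
```

A malformed field (value out of range, or the wrong delimiter after it) makes
the whole line fail with -EINVAL, which is why the parsers in main.c can bail
out on the first bad token.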

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 app/Makefile          |    1 +
 app/test-acl/Makefile |   45 +++
 app/test-acl/main.c   | 1029 +++++++++++++++++++++++++++++++++++++++++++++++++
 app/test-acl/main.h   |   50 +++
 4 files changed, 1125 insertions(+), 0 deletions(-)
 create mode 100644 app/test-acl/Makefile
 create mode 100644 app/test-acl/main.c
 create mode 100644 app/test-acl/main.h

diff --git a/app/Makefile b/app/Makefile
index 6267d7b..c398771 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -35,5 +35,6 @@ DIRS-$(CONFIG_RTE_APP_TEST) += test
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += dump_cfg
+DIRS-$(CONFIG_RTE_LIBRTE_ACL) += test-acl
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/test-acl/Makefile b/app/test-acl/Makefile
new file mode 100644
index 0000000..00fa3b6
--- /dev/null
+++ b/app/test-acl/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = testacl
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) := main.c
+
+# this application needs libraries first
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ACL) += lib
+
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-acl/main.c b/app/test-acl/main.c
new file mode 100644
index 0000000..78d9ae5
--- /dev/null
+++ b/app/test-acl/main.c
@@ -0,0 +1,1029 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_acl.h>
+#include <getopt.h>
+#include <string.h>
+
+#ifndef RTE_LIBRTE_ACL_STANDALONE
+
+#include <rte_cycles.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_ip.h>
+
+#define	PRINT_USAGE_START	"%s [EAL options]\n"
+
+#else
+
+#define IPv4(a, b, c, d) ((uint32_t)(((a) & 0xff) << 24) | \
+				(((b) & 0xff) << 16) |     \
+				(((c) & 0xff) << 8)  |     \
+				((d) & 0xff))
+
+#define	RTE_LCORE_FOREACH_SLAVE(x)	while (((x) = 0))
+
+#define	rte_eal_remote_launch(a, b, c)	DUMMY_MACRO
+#define	rte_eal_mp_wait_lcore()		DUMMY_MACRO
+
+#define	rte_eal_init(c, v)	(0)
+
+#define	PRINT_USAGE_START	"%s\n"
+
+#endif /*RTE_LIBRTE_ACL_STANDALONE */
+
+#include "main.h"
+
+#define GET_CB_FIELD(in, fd, base, lim, dlm)	do {            \
+	unsigned long val;                                      \
+	char *end_fld;                                          \
+	errno = 0;                                              \
+	val = strtoul((in), &end_fld, (base));                  \
+	if (errno != 0 || end_fld[0] != (dlm) || val > (lim))   \
+		return (-EINVAL);                               \
+	(fd) = (typeof(fd))val;                                 \
+	(in) = end_fld + 1;                                     \
+} while (0)
+
+#define	OPT_RULE_FILE		"rulesf"
+#define	OPT_TRACE_FILE		"tracef"
+#define	OPT_RULE_NUM		"rulenum"
+#define	OPT_TRACE_NUM		"tracenum"
+#define	OPT_TRACE_STEP		"tracestep"
+#define	OPT_SEARCH_SCALAR	"scalar"
+#define	OPT_BLD_CATEGORIES	"bldcat"
+#define	OPT_RUN_CATEGORIES	"runcat"
+#define	OPT_ITER_NUM		"iter"
+#define	OPT_VERBOSE		"verbose"
+#define	OPT_IPV6		"ipv6"
+
+#define	TRACE_DEFAULT_NUM	0x10000
+#define	TRACE_STEP_MAX		0x1000
+#define	TRACE_STEP_DEF		0x100
+
+#define	RULE_NUM		0x10000
+
+enum {
+	DUMP_NONE,
+	DUMP_SEARCH,
+	DUMP_PKT,
+	DUMP_MAX
+};
+
+static struct {
+	const char         *prgname;
+	const char         *rule_file;
+	const char         *trace_file;
+	uint32_t            bld_categories;
+	uint32_t            run_categories;
+	uint32_t            nb_rules;
+	uint32_t            nb_traces;
+	uint32_t            trace_step;
+	uint32_t            trace_sz;
+	uint32_t            iter_num;
+	uint32_t            verbose;
+	uint32_t            scalar;
+	uint32_t            used_traces;
+	void               *traces;
+	struct rte_acl_ctx *acx;
+	uint32_t			ipv6;
+} config = {
+	.bld_categories = 3,
+	.run_categories = 1,
+	.nb_rules = RULE_NUM,
+	.nb_traces = TRACE_DEFAULT_NUM,
+	.trace_step = TRACE_STEP_DEF,
+	.iter_num = 1,
+	.verbose = DUMP_MAX,
+	.ipv6 = 0
+};
+
+static struct rte_acl_param prm = {
+	.name = APP_NAME,
+	.socket_id = SOCKET_ID_ANY,
+};
+
+/*
+ * Rule and trace formats definitions.
+ */
+
+struct ipv4_5tuple {
+	uint8_t  proto;
+	uint32_t ip_src;
+	uint32_t ip_dst;
+	uint16_t port_src;
+	uint16_t port_dst;
+};
+
+enum {
+	PROTO_FIELD_IPV4,
+	SRC_FIELD_IPV4,
+	DST_FIELD_IPV4,
+	SRCP_FIELD_IPV4,
+	DSTP_FIELD_IPV4,
+	NUM_FIELDS_IPV4
+};
+
+struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof(uint8_t),
+		.field_index = PROTO_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PROTO,
+		.offset = offsetof(struct ipv4_5tuple, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_SRC,
+		.offset = offsetof(struct ipv4_5tuple, ip_src),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_DST,
+		.offset = offsetof(struct ipv4_5tuple, ip_dst),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = SRCP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = offsetof(struct ipv4_5tuple, port_src),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = DSTP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = offsetof(struct ipv4_5tuple, port_dst),
+	},
+};
+
+#define	IPV6_ADDR_LEN	16
+#define	IPV6_ADDR_U16	(IPV6_ADDR_LEN / sizeof(uint16_t))
+#define	IPV6_ADDR_U32	(IPV6_ADDR_LEN / sizeof(uint32_t))
+
+struct ipv6_5tuple {
+	uint8_t  proto;
+	uint32_t ip_src[IPV6_ADDR_U32];
+	uint32_t ip_dst[IPV6_ADDR_U32];
+	uint16_t port_src;
+	uint16_t port_dst;
+};
+
+enum {
+	PROTO_FIELD_IPV6,
+	SRC1_FIELD_IPV6,
+	SRC2_FIELD_IPV6,
+	SRC3_FIELD_IPV6,
+	SRC4_FIELD_IPV6,
+	DST1_FIELD_IPV6,
+	DST2_FIELD_IPV6,
+	DST3_FIELD_IPV6,
+	DST4_FIELD_IPV6,
+	SRCP_FIELD_IPV6,
+	DSTP_FIELD_IPV6,
+	NUM_FIELDS_IPV6
+};
+
+struct rte_acl_field_def ipv6_defs[NUM_FIELDS_IPV6] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof(uint8_t),
+		.field_index = PROTO_FIELD_IPV6,
+		.input_index = PROTO_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC1_FIELD_IPV6,
+		.input_index = SRC1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[0]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC2_FIELD_IPV6,
+		.input_index = SRC2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[1]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC3_FIELD_IPV6,
+		.input_index = SRC3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[2]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = SRC4_FIELD_IPV6,
+		.input_index = SRC4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_src[3]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST1_FIELD_IPV6,
+		.input_index = DST1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[0]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST2_FIELD_IPV6,
+		.input_index = DST2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[1]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST3_FIELD_IPV6,
+		.input_index = DST3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[2]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof(uint32_t),
+		.field_index = DST4_FIELD_IPV6,
+		.input_index = DST4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, ip_dst[3]),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = SRCP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, port_src),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof(uint16_t),
+		.field_index = DSTP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_5tuple, port_dst),
+	},
+};
+
+
+enum {
+	CB_FLD_SRC_ADDR,
+	CB_FLD_DST_ADDR,
+	CB_FLD_SRC_PORT_LOW,
+	CB_FLD_SRC_PORT_DLM,
+	CB_FLD_SRC_PORT_HIGH,
+	CB_FLD_DST_PORT_LOW,
+	CB_FLD_DST_PORT_DLM,
+	CB_FLD_DST_PORT_HIGH,
+	CB_FLD_PROTO,
+	CB_FLD_NUM,
+};
+
+enum {
+	CB_TRC_SRC_ADDR,
+	CB_TRC_DST_ADDR,
+	CB_TRC_SRC_PORT,
+	CB_TRC_DST_PORT,
+	CB_TRC_PROTO,
+	CB_TRC_NUM,
+};
+
+RTE_ACL_RULE_DEF(acl_rule, RTE_ACL_MAX_FIELDS);
+
+static const char cb_port_delim[] = ":";
+
+static char line[LINE_MAX];
+
+#define	dump_verbose(lvl, fh, fmt, args...)	do { \
+	if ((lvl) <= (int32_t)config.verbose)        \
+		fprintf(fh, fmt, ##args);            \
+} while (0)
+
+
+/*
+ * Parse ClassBench input trace (test vectors and expected results) file.
+ * Expected format:
+ * <src_ipv4_addr> <space> <dst_ipv4_addr> <space> \
+ * <src_port> <space> <dst_port> <space> <proto>
+ */
+static int
+parse_cb_ipv4_trace(char *str, struct ipv4_5tuple *v)
+{
+	int i;
+	char *s, *sp, *in[CB_TRC_NUM];
+	static const char *dlm = " \t\n";
+
+	s = str;
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	GET_CB_FIELD(in[CB_TRC_SRC_ADDR], v->ip_src, 0, UINT32_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_DST_ADDR], v->ip_dst, 0, UINT32_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_SRC_PORT], v->port_src, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_DST_PORT], v->port_dst, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_PROTO], v->proto, 0, UINT8_MAX, 0);
+
+	/* convert to network byte order. */
+	v->ip_src = rte_cpu_to_be_32(v->ip_src);
+	v->ip_dst = rte_cpu_to_be_32(v->ip_dst);
+	v->port_src = rte_cpu_to_be_16(v->port_src);
+	v->port_dst = rte_cpu_to_be_16(v->port_dst);
+
+	return (0);
+}
+
+/*
+ * Parses an IPv6 address; expects the following format:
+ * XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX (where X is a hexadecimal digit).
+ */
+static int
+parse_ipv6_addr(const char *in, const char **end, uint32_t v[IPV6_ADDR_U32],
+	char dlm)
+{
+	uint32_t addr[IPV6_ADDR_U16];
+
+	GET_CB_FIELD(in, addr[0], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[1], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[2], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[3], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[4], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[5], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[6], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[7], 16, UINT16_MAX, dlm);
+
+	*end = in;
+
+	v[0] = (addr[0] << 16) + addr[1];
+	v[1] = (addr[2] << 16) + addr[3];
+	v[2] = (addr[4] << 16) + addr[5];
+	v[3] = (addr[6] << 16) + addr[7];
+
+	return (0);
+}
+
+static int
+parse_cb_ipv6_addr_trace(const char *in, uint32_t v[IPV6_ADDR_U32])
+{
+	int32_t rc;
+	const char *end;
+
+	if ((rc = parse_ipv6_addr(in, &end, v, 0)) != 0)
+		return (rc);
+
+	v[0] = rte_cpu_to_be_32(v[0]);
+	v[1] = rte_cpu_to_be_32(v[1]);
+	v[2] = rte_cpu_to_be_32(v[2]);
+	v[3] = rte_cpu_to_be_32(v[3]);
+
+	return (0);
+}
+
+/*
+ * Parse ClassBench input trace (test vectors and expected results) file.
+ * Expected format:
+ * <src_ipv6_addr> <space> <dst_ipv6_addr> <space> \
+ * <src_port> <space> <dst_port> <space> <proto>
+ */
+static int
+parse_cb_ipv6_trace(char *str, struct ipv6_5tuple *v)
+{
+	int32_t i, rc;
+	char *s, *sp, *in[CB_TRC_NUM];
+	static const char *dlm = " \t\n";
+
+	s = str;
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	/* get ip6 src address. */
+	if ((rc = parse_cb_ipv6_addr_trace(in[CB_TRC_SRC_ADDR],
+			v->ip_src)) != 0)
+		return (rc);
+
+	/* get ip6 dst address. */
+	if ((rc = parse_cb_ipv6_addr_trace(in[CB_TRC_DST_ADDR],
+			v->ip_dst)) != 0)
+		return (rc);
+
+	GET_CB_FIELD(in[CB_TRC_SRC_PORT], v->port_src, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_DST_PORT], v->port_dst, 0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_TRC_PROTO], v->proto, 0, UINT8_MAX, 0);
+
+	/* convert to network byte order. */
+	v->port_src = rte_cpu_to_be_16(v->port_src);
+	v->port_dst = rte_cpu_to_be_16(v->port_dst);
+
+	return (0);
+}
+
+static void
+tracef_init(void)
+{
+	static const char name[] = APP_NAME;
+	FILE *f;
+	size_t sz;
+	uint32_t n;
+	struct ipv4_5tuple *v;
+	struct ipv6_5tuple *w;
+
+	sz = config.nb_traces * (config.ipv6 ? sizeof(*w) : sizeof(*v));
+	if ((config.traces = rte_zmalloc_socket(name, sz, CACHE_LINE_SIZE,
+			SOCKET_ID_ANY)) == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot allocate %zu bytes for "
+			"%u requested trace records\n",
+			sz, config.nb_traces);
+
+	if ((f = fopen(config.trace_file, "r")) == NULL)
+		rte_exit(-EINVAL, "failed to open file: %s\n",
+			config.trace_file);
+
+	v = config.traces;
+	w = config.traces;
+	for (n = 0; n != config.nb_traces; n++) {
+
+		if (fgets(line, sizeof(line), f) == NULL)
+			break;
+
+		if (config.ipv6) {
+			if (parse_cb_ipv6_trace(line, w + n) != 0)
+				rte_exit(EXIT_FAILURE,
+					"%s: failed to parse ipv6 trace "
+					"record at line %u\n",
+					config.trace_file, n + 1);
+		} else {
+			if (parse_cb_ipv4_trace(line, v + n) != 0)
+				rte_exit(EXIT_FAILURE,
+					"%s: failed to parse ipv4 trace "
+					"record at line %u\n",
+					config.trace_file, n + 1);
+		}
+	}
+
+	config.used_traces = n;
+	fclose(f);
+}
+
+static int
+parse_ipv6_net(const char *in, struct rte_acl_field field[4])
+{
+	int32_t rc;
+	const char *mp;
+	uint32_t i, m, v[4];
+	const uint32_t nbu32 = sizeof(uint32_t) * CHAR_BIT;
+
+	/* get address. */
+	if ((rc = parse_ipv6_addr(in, &mp, v, '/')) != 0)
+		return (rc);
+
+	/* get mask. */
+	GET_CB_FIELD(mp, m, 0, CHAR_BIT * sizeof(v), 0);
+
+	/* put all together. */
+	for (i = 0; i != RTE_DIM(v); i++) {
+		if (m >= (i + 1) * nbu32)
+			field[i].mask_range.u32 = nbu32;
+		else
+			field[i].mask_range.u32 = m > (i * nbu32) ?
+				m - (i * nbu32) : 0;
+
+		field[i].value.u32 = v[i];
+	}
+
+	return (0);
+}
+
+
+static int
+parse_cb_ipv6_rule(char *str, struct acl_rule *v)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+
+	/*
+	 * Skip leading '@'
+	 */
+	if (strchr(str, '@') != str)
+		return (-EINVAL);
+
+	s = str + 1;
+
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_SRC_ADDR],
+			v->field + SRC1_FIELD_IPV6)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_DST_ADDR],
+			v->field + DST1_FIELD_IPV6)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	/* source port. */
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	/* destination port. */
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	return (0);
+}
+
+static int
+parse_ipv4_net(const char *in, uint32_t *addr, uint32_t *mask_len)
+{
+	uint8_t a, b, c, d, m;
+
+	GET_CB_FIELD(in, a, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, b, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, c, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, d, 0, UINT8_MAX, '/');
+	GET_CB_FIELD(in, m, 0, sizeof(uint32_t) * CHAR_BIT, 0);
+
+	addr[0] = IPv4(a, b, c, d);
+	mask_len[0] = m;
+
+	return (0);
+}
+/*
+ * Parse ClassBench rules file.
+ * Expected format:
+ * '@'<src_ipv4_addr>'/'<masklen> <space> \
+ * <dst_ipv4_addr>'/'<masklen> <space> \
+ * <src_port_low> <space> ":" <src_port_high> <space> \
+ * <dst_port_low> <space> ":" <dst_port_high> <space> \
+ * <proto>'/'<mask>
+ */
+static int
+parse_cb_ipv4_rule(char *str, struct acl_rule *v)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+
+	/*
+	 * Skip leading '@'
+	 */
+	if (strchr(str, '@') != str)
+		return (-EINVAL);
+
+	s = str + 1;
+
+	for (i = 0; i != RTE_DIM(in); i++) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+		s = NULL;
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_SRC_ADDR],
+			&v->field[SRC_FIELD_IPV4].value.u32,
+			&v->field[SRC_FIELD_IPV4].mask_range.u32)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_DST_ADDR],
+			&v->field[DST_FIELD_IPV4].value.u32,
+			&v->field[DST_FIELD_IPV4].mask_range.u32)) != 0) {
+		RTE_LOG(ERR, TESTACL,
+			"failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	/* source port. */
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	/* destination port. */
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof(cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	return (0);
+}
+
+typedef int (*parse_5tuple)(char *text, struct acl_rule *rule);
+
+static int
+add_cb_rules(FILE *f, struct rte_acl_ctx *ctx)
+{
+	int rc;
+	uint32_t n;
+	struct acl_rule v;
+	parse_5tuple parser;
+
+	memset(&v, 0, sizeof(v));
+	parser = (config.ipv6 != 0) ? parse_cb_ipv6_rule : parse_cb_ipv4_rule;
+
+	for (n = 1; fgets(line, sizeof(line), f) != NULL; n++) {
+
+		if ((rc = parser(line, &v)) != 0) {
+			RTE_LOG(ERR, TESTACL, "line %u: parse rule"
+				" failed, error code: %d (%s)\n",
+				n, rc, strerror(-rc));
+			return (rc);
+		}
+
+		v.data.category_mask = LEN2MASK(RTE_ACL_MAX_CATEGORIES);
+		v.data.priority = RTE_ACL_MAX_PRIORITY - n;
+		v.data.userdata = n;
+
+		if ((rc = rte_acl_add_rules(ctx, (struct rte_acl_rule *)&v,
+				1)) != 0) {
+			RTE_LOG(ERR, TESTACL, "line %u: failed to add rules "
+				"into ACL context, error code: %d (%s)\n",
+				n, rc, strerror(-rc));
+			return (rc);
+		}
+	}
+
+	return (0);
+}
+
+static void
+acx_init(void)
+{
+	int ret;
+	FILE *f;
+	struct rte_acl_config cfg;
+
+	/* setup ACL build config. */
+	if (config.ipv6) {
+		cfg.num_fields = RTE_DIM(ipv6_defs);
+		memcpy(&cfg.defs, ipv6_defs, sizeof(ipv6_defs));
+	} else {
+		cfg.num_fields = RTE_DIM(ipv4_defs);
+		memcpy(&cfg.defs, ipv4_defs, sizeof(ipv4_defs));
+	}
+	cfg.num_categories = config.bld_categories;
+
+	/* setup ACL creation parameters. */
+	prm.rule_size = RTE_ACL_RULE_SZ(cfg.num_fields);
+	prm.max_rule_num = config.nb_rules;
+
+	if ((config.acx = rte_acl_create(&prm)) == NULL)
+		rte_exit(rte_errno, "failed to create ACL context\n");
+
+	/* add ACL rules. */
+	if ((f = fopen(config.rule_file, "r")) == NULL)
+		rte_exit(-EINVAL, "failed to open file %s\n",
+			config.rule_file);
+
+	if ((ret = add_cb_rules(f, config.acx)) != 0)
+		rte_exit(rte_errno, "failed to add rules into ACL context\n");
+
+	fclose(f);
+
+	/* perform build. */
+	ret = rte_acl_build(config.acx, &cfg);
+
+	dump_verbose(DUMP_NONE, stdout,
+		"rte_acl_build(%u) finished with %d\n",
+		config.bld_categories, ret);
+
+	rte_acl_dump(config.acx);
+
+	if (ret != 0)
+		rte_exit(ret, "failed to build search context\n");
+}
+
+static uint32_t
+search_ip5tuples_once(uint32_t categories, uint32_t step, int scalar)
+{
+	int ret;
+	uint32_t i, j, k, n, r;
+	const uint8_t *data[step], *v;
+	uint32_t results[step * categories];
+
+	v = config.traces;
+	for (i = 0; i != config.used_traces; i += n) {
+
+		n = RTE_MIN(step, config.used_traces - i);
+
+		for (j = 0; j != n; j++) {
+			data[j] = v;
+			v += config.trace_sz;
+		}
+
+		if (scalar != 0)
+			ret = rte_acl_classify_scalar(config.acx, data,
+				results, n, categories);
+
+		else
+			ret = rte_acl_classify(config.acx, data,
+				results, n, categories);
+
+		if (ret != 0)
+			rte_exit(ret, "classify for ipv%c_5tuples returns %d\n",
+				config.ipv6 ? '6' : '4', ret);
+
+		for (r = 0, j = 0; j != n; j++) {
+			for (k = 0; k != categories; k++, r++) {
+				dump_verbose(DUMP_PKT, stdout,
+					"ipv%c_5tuple: %u, category: %u, "
+					"result: %u\n",
+					config.ipv6 ? '6' : '4',
+					i + j + 1, k, results[r] - 1);
+			}
+
+		}
+	}
+
+	dump_verbose(DUMP_SEARCH, stdout,
+		"%s(%u, %u, %s) returns %u\n", __func__,
+		categories, step, scalar != 0 ? "scalar" : "sse", i);
+	return (i);
+}
+
+static int
+search_ip5tuples(__attribute__((unused)) void *arg)
+{
+	uint64_t pkt, start, tm;
+	uint32_t i, lcore;
+
+	lcore = rte_lcore_id();
+	start = rte_rdtsc();
+	pkt = 0;
+
+	for (i = 0; i != config.iter_num; i++) {
+		pkt += search_ip5tuples_once(config.run_categories,
+			config.trace_step, config.scalar);
+	}
+
+	tm = rte_rdtsc() - start;
+	dump_verbose(DUMP_NONE, stdout,
+		"%s  @lcore %u: %" PRIu32 " iterations, %" PRIu64 " pkts, %"
+		PRIu32 " categories, %" PRIu64 " cycles, %#Lf cycles/pkt\n",
+		__func__, lcore, i, pkt, config.run_categories,
+		tm, (long double)tm / pkt);
+
+	return (0);
+}
+
+static uint32_t
+get_uint32_opt(const char *opt, const char *name, uint32_t min, uint32_t max)
+{
+	unsigned long val;
+	char *end;
+
+	errno = 0;
+	val = strtoul(opt, &end, 0);
+	if (errno != 0 || end[0] != 0 || val > max || val < min)
+		rte_exit(-EINVAL, "invalid value: \"%s\" for option: %s\n",
+			opt, name);
+	return (val);
+}
+
+static void
+print_usage(const char *prgname)
+{
+	fprintf(stdout,
+		PRINT_USAGE_START
+		"--" OPT_RULE_FILE "=<rules set file>\n"
+		"[--" OPT_TRACE_FILE "=<input traces file>]\n"
+		"[--" OPT_RULE_NUM
+			"=<maximum number of rules for ACL context>]\n"
+		"[--" OPT_TRACE_NUM
+			"=<number of trace records to read from the trace file>]\n"
+		"[--" OPT_TRACE_STEP
+			"=<number of traces to classify per one call>]\n"
+		"[--" OPT_BLD_CATEGORIES
+			"=<number of categories to build with>]\n"
+		"[--" OPT_RUN_CATEGORIES
+			"=<number of categories to run with> "
+			"should be either 1 or multiple of %zu, "
+			"but not greater than %u]\n"
+		"[--" OPT_ITER_NUM "=<number of iterations to perform>]\n"
+		"[--" OPT_VERBOSE "=<verbose level>]\n"
+		"[--" OPT_SEARCH_SCALAR "=<use scalar version>]\n"
+		"[--" OPT_IPV6 "=<IPv6 rules and trace files>]\n",
+		prgname, RTE_ACL_RESULTS_MULTIPLIER,
+		(uint32_t)RTE_ACL_MAX_CATEGORIES);
+}
+
+static void
+dump_config(FILE *f)
+{
+	fprintf(f, "%s:\n", __func__);
+	fprintf(f, "%s:%s\n", OPT_RULE_FILE, config.rule_file);
+	fprintf(f, "%s:%s\n", OPT_TRACE_FILE, config.trace_file);
+	fprintf(f, "%s:%u\n", OPT_RULE_NUM, config.nb_rules);
+	fprintf(f, "%s:%u\n", OPT_TRACE_NUM, config.nb_traces);
+	fprintf(f, "%s:%u\n", OPT_TRACE_STEP, config.trace_step);
+	fprintf(f, "%s:%u\n", OPT_BLD_CATEGORIES, config.bld_categories);
+	fprintf(f, "%s:%u\n", OPT_RUN_CATEGORIES, config.run_categories);
+	fprintf(f, "%s:%u\n", OPT_ITER_NUM, config.iter_num);
+	fprintf(f, "%s:%u\n", OPT_VERBOSE, config.verbose);
+	fprintf(f, "%s:%u\n", OPT_SEARCH_SCALAR, config.scalar);
+	fprintf(f, "%s:%u\n", OPT_IPV6, config.ipv6);
+}
+
+static void
+check_config(void)
+{
+	if (config.rule_file == NULL) {
+		print_usage(config.prgname);
+		rte_exit(-EINVAL, "mandatory option %s is not specified\n",
+			OPT_RULE_FILE);
+	}
+}
+
+
+static void
+get_input_opts(int argc, char **argv)
+{
+	static struct option lgopts[] = {
+		{OPT_RULE_FILE, 1, 0, 0},
+		{OPT_TRACE_FILE, 1, 0, 0},
+		{OPT_TRACE_NUM, 1, 0, 0},
+		{OPT_RULE_NUM, 1, 0, 0},
+		{OPT_TRACE_STEP, 1, 0, 0},
+		{OPT_BLD_CATEGORIES, 1, 0, 0},
+		{OPT_RUN_CATEGORIES, 1, 0, 0},
+		{OPT_ITER_NUM, 1, 0, 0},
+		{OPT_VERBOSE, 1, 0, 0},
+		{OPT_SEARCH_SCALAR, 0, 0, 0},
+		{OPT_IPV6, 0, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	int opt, opt_idx;
+
+	while ((opt = getopt_long(argc, argv, "", lgopts,  &opt_idx)) != EOF) {
+
+		if (opt != 0) {
+			print_usage(config.prgname);
+			rte_exit(-EINVAL, "unknown option: %c", opt);
+		}
+
+		if (strcmp(lgopts[opt_idx].name, OPT_RULE_FILE) == 0) {
+			config.rule_file = optarg;
+		} else if (strcmp(lgopts[opt_idx].name, OPT_TRACE_FILE) == 0) {
+			config.trace_file = optarg;
+		} else if (strcmp(lgopts[opt_idx].name, OPT_RULE_NUM) == 0) {
+			config.nb_rules = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, RTE_ACL_MAX_INDEX + 1);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_TRACE_NUM) == 0) {
+			config.nb_traces = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, UINT32_MAX);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_TRACE_STEP) == 0) {
+			config.trace_step = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, TRACE_STEP_MAX);
+		} else if (strcmp(lgopts[opt_idx].name,
+				OPT_BLD_CATEGORIES) == 0) {
+			config.bld_categories = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1,
+				RTE_ACL_MAX_CATEGORIES);
+		} else if (strcmp(lgopts[opt_idx].name,
+				OPT_RUN_CATEGORIES) == 0) {
+			config.run_categories = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1,
+				RTE_ACL_MAX_CATEGORIES);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_ITER_NUM) == 0) {
+			config.iter_num = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, 1, UINT16_MAX);
+		} else if (strcmp(lgopts[opt_idx].name, OPT_VERBOSE) == 0) {
+			config.verbose = get_uint32_opt(optarg,
+				lgopts[opt_idx].name, DUMP_NONE, DUMP_MAX);
+		} else if (strcmp(lgopts[opt_idx].name,
+				OPT_SEARCH_SCALAR) == 0) {
+			config.scalar = 1;
+		} else if (strcmp(lgopts[opt_idx].name, OPT_IPV6) == 0) {
+			config.ipv6 = 1;
+		}
+	}
+	config.trace_sz = config.ipv6 ? sizeof(struct ipv6_5tuple) :
+						sizeof(struct ipv4_5tuple);
+
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	int ret;
+	uint32_t lcore;
+
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_panic("Cannot init EAL\n");
+
+	argc -= ret;
+	argv += ret;
+
+	config.prgname = argv[0];
+
+	get_input_opts(argc, argv);
+	dump_config(stdout);
+	check_config();
+
+	acx_init();
+
+	if (config.trace_file != NULL)
+		tracef_init();
+
+	RTE_LCORE_FOREACH_SLAVE(lcore)
+		 rte_eal_remote_launch(search_ip5tuples, NULL, lcore);
+
+	search_ip5tuples(NULL);
+
+	rte_eal_mp_wait_lcore();
+
+	rte_acl_free(config.acx);
+	return (0);
+}
diff --git a/app/test-acl/main.h b/app/test-acl/main.h
new file mode 100644
index 0000000..cec0408
--- /dev/null
+++ b/app/test-acl/main.h
@@ -0,0 +1,50 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+#define	RTE_LOGTYPE_TESTACL	RTE_LOGTYPE_USER1
+
+#define	APP_NAME	"TESTACL"
+
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCHv2 4/5] acl: New sample l3fwd-acl.
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
                   ` (2 preceding siblings ...)
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 3/5] acl: New test-acl application Konstantin Ananyev
@ 2014-05-28 19:26 ` Konstantin Ananyev
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 5/5] acl: add doxygen configuration and start page Konstantin Ananyev
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ananyev @ 2014-05-28 19:26 UTC (permalink / raw)
  To: dev, dev

Demonstrates the use of the ACL library in the DPDK application to
implement packet classification and L3 forwarding.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 examples/Makefile           |    1 +
 examples/l3fwd-acl/Makefile |   56 ++
 examples/l3fwd-acl/main.c   | 2048 +++++++++++++++++++++++++++++++++++++++++++
 examples/l3fwd-acl/main.h   |   45 +
 4 files changed, 2150 insertions(+), 0 deletions(-)
 create mode 100644 examples/l3fwd-acl/Makefile
 create mode 100644 examples/l3fwd-acl/main.c
 create mode 100644 examples/l3fwd-acl/main.h

diff --git a/examples/Makefile b/examples/Makefile
index d6b08c2..f3d1726 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -64,5 +64,6 @@ DIRS-y += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
 DIRS-y += vmdq
 DIRS-y += vmdq_dcb
+DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
 
 include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/l3fwd-acl/Makefile b/examples/l3fwd-acl/Makefile
new file mode 100644
index 0000000..7ba7247
--- /dev/null
+++ b/examples/l3fwd-acl/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-default-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = l3fwd-acl
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
+endif
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/l3fwd-acl/main.c b/examples/l3fwd-acl/main.c
new file mode 100644
index 0000000..782824a
--- /dev/null
+++ b/examples/l3fwd-acl/main.c
@@ -0,0 +1,2048 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <sys/types.h>
+#include <string.h>
+#include <sys/queue.h>
+#include <stdarg.h>
+#include <errno.h>
+#include <getopt.h>
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_memzone.h>
+#include <rte_tailq.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_launch.h>
+#include <rte_atomic.h>
+#include <rte_cycles.h>
+#include <rte_prefetch.h>
+#include <rte_lcore.h>
+#include <rte_per_lcore.h>
+#include <rte_branch_prediction.h>
+#include <rte_interrupts.h>
+#include <rte_pci.h>
+#include <rte_random.h>
+#include <rte_debug.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_ring.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_ip.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+#include <rte_string_fns.h>
+#include <rte_acl.h>
+
+#include "main.h"
+
+#define DO_RFC_1812_CHECKS
+
+#define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1
+
+#define MAX_JUMBO_PKT_LEN  9600
+
+#define MEMPOOL_CACHE_SIZE 256
+
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+/*
+ * This expression is used to calculate the number of mbufs needed,
+ * depending on user input, taking into account memory for the RX and TX
+ * hardware rings, the cache per lcore and the mtable per port per lcore.
+ * RTE_MAX is used to ensure that NB_MBUF never goes below a minimum
+ * value of 8192.
+ */
+
+#define NB_MBUF RTE_MAX	(																	\
+				(nb_ports*nb_rx_queue*RTE_TEST_RX_DESC_DEFAULT +							\
+				nb_ports*nb_lcores*MAX_PKT_BURST +											\
+				nb_ports*n_tx_queue*RTE_TEST_TX_DESC_DEFAULT +								\
+				nb_lcores*MEMPOOL_CACHE_SIZE),												\
+				(unsigned)8192)
+
+/*
+ * RX and TX Prefetch, Host, and Write-back threshold values should be
+ * carefully set for optimal performance. Consult the network
+ * controller's datasheet and supporting DPDK documentation for guidance
+ * on how these parameters should be set.
+ */
+#define RX_PTHRESH 8 /**< Default values of RX prefetch threshold reg. */
+#define RX_HTHRESH 8 /**< Default values of RX host threshold reg. */
+#define RX_WTHRESH 4 /**< Default values of RX write-back threshold reg. */
+
+/*
+ * These default values are optimized for use with the Intel(R) 82599 10 GbE
+ * Controller and the DPDK ixgbe PMD. Consider using other values for other
+ * network controllers and/or network drivers.
+ */
+#define TX_PTHRESH 36 /**< Default values of TX prefetch threshold reg. */
+#define TX_HTHRESH 0  /**< Default values of TX host threshold reg. */
+#define TX_WTHRESH 0  /**< Default values of TX write-back threshold reg. */
+
+#define MAX_PKT_BURST 32
+#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+#define NB_SOCKETS 8
+
+/* Configure how many packets ahead to prefetch when reading packets */
+#define PREFETCH_OFFSET	3
+
+/*
+ * Configurable number of RX/TX ring descriptors
+ */
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
+static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
+static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
+
+/* ethernet addresses of ports */
+static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
+
+/* mask of enabled ports */
+static uint32_t enabled_port_mask = 0;
+static int promiscuous_on = 0; /**< Promiscuous mode disabled by default. */
+static int numa_on = 1; /**< NUMA is enabled by default. */
+
+struct mbuf_table {
+	uint16_t len;
+	struct rte_mbuf *m_table[MAX_PKT_BURST];
+};
+
+struct lcore_rx_queue {
+	uint8_t port_id;
+	uint8_t queue_id;
+} __rte_cache_aligned;
+
+#define MAX_RX_QUEUE_PER_LCORE 16
+#define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
+#define MAX_RX_QUEUE_PER_PORT 128
+
+#define MAX_LCORE_PARAMS 1024
+struct lcore_params {
+	uint8_t port_id;
+	uint8_t queue_id;
+	uint8_t lcore_id;
+} __rte_cache_aligned;
+
+static struct lcore_params lcore_params_array[MAX_LCORE_PARAMS];
+static struct lcore_params lcore_params_array_default[] = {
+	{0, 0, 2},
+	{0, 1, 2},
+	{0, 2, 2},
+	{1, 0, 2},
+	{1, 1, 2},
+	{1, 2, 2},
+	{2, 0, 2},
+	{3, 0, 3},
+	{3, 1, 3},
+};
+
+static struct lcore_params * lcore_params = lcore_params_array_default;
+static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
+				sizeof(lcore_params_array_default[0]);
+
+static struct rte_eth_conf port_conf = {
+	.rxmode = {
+		.mq_mode	= ETH_MQ_RX_RSS,
+		.max_rx_pkt_len = ETHER_MAX_LEN,
+		.split_hdr_size = 0,
+		.header_split   = 0, /**< Header Split disabled */
+		.hw_ip_checksum = 1, /**< IP checksum offload enabled */
+		.hw_vlan_filter = 0, /**< VLAN filtering disabled */
+		.jumbo_frame    = 0, /**< Jumbo Frame Support disabled */
+		.hw_strip_crc   = 0, /**< CRC stripping by hardware disabled */
+	},
+	.rx_adv_conf = {
+		.rss_conf = {
+			.rss_key = NULL,
+			.rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV4_TCP | ETH_RSS_IPV4_UDP
+			        | ETH_RSS_IPV6 | ETH_RSS_IPV6_EX
+			        | ETH_RSS_IPV6_TCP | ETH_RSS_IPV6_TCP_EX
+			        | ETH_RSS_IPV6_UDP | ETH_RSS_IPV6_UDP_EX,
+		},
+	},
+	.txmode = {
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+};
+
+static const struct rte_eth_rxconf rx_conf = {
+	.rx_thresh = {
+		.pthresh = RX_PTHRESH,
+		.hthresh = RX_HTHRESH,
+		.wthresh = RX_WTHRESH,
+	},
+	.rx_free_thresh = 32,
+};
+
+static const struct rte_eth_txconf tx_conf = {
+	.tx_thresh = {
+		.pthresh = TX_PTHRESH,
+		.hthresh = TX_HTHRESH,
+		.wthresh = TX_WTHRESH,
+	},
+	.tx_free_thresh = 0, /* Use PMD default values */
+	.tx_rs_thresh = 0, /* Use PMD default values */
+	.txq_flags = 0x0,
+};
+
+static struct rte_mempool * pktmbuf_pool[NB_SOCKETS];
+
+/***********************start of ACL part******************************/
+#ifdef DO_RFC_1812_CHECKS
+static inline int
+is_valid_ipv4_pkt(struct ipv4_hdr *pkt, uint32_t link_len);
+#endif
+static inline int
+send_single_packet(struct rte_mbuf *m, uint8_t port);
+
+/* #define L3FWDACL_DEBUG 1 */
+#define MAX_ACL_RULE_NUM 100000
+#define DEFAULT_MAX_CATEGORIES 1
+#define L3FWD_ACL_IPV4_NAME "l3fwd-acl-ipv4"
+#define L3FWD_ACL_IPV6_NAME "l3fwd-acl-ipv6"
+#define ACL_LEAD_CHAR '@'
+#define ROUTE_LEAD_CHAR 'R'
+#define COMMENT_LEAD_CHAR '#'
+#define OPTION_CONFIG 	"config"
+#define OPTION_NONUMA	"no-numa"
+#define OPTION_ENBJMO	"enable-jumbo"
+#define OPTION_RULE_IPV4	"rule_ipv4"
+#define OPTION_RULE_IPV6	"rule_ipv6"
+#define OPTION_SCALAR	"scalar"
+#define ACL_DENY_SIGNATURE 0xf0000000
+#define RTE_LOGTYPE_L3FWDACL RTE_LOGTYPE_USER3
+#define acl_log(format, ...) RTE_LOG(ERR, L3FWDACL, format, ##__VA_ARGS__)
+#define uint32_t_to_char(ip, a, b, c, d) do{\
+		*a = (unsigned char)(ip >> 24 & 0xff);\
+		*b = (unsigned char)(ip >> 16 & 0xff);\
+		*c = (unsigned char)(ip >> 8 & 0xff);\
+		*d = (unsigned char)(ip & 0xff);\
+	}while(0)
+#define OFF_ETHHEAD	(sizeof(struct ether_hdr))
+#define OFF_IPV42PROTO (offsetof(struct ipv4_hdr, next_proto_id))
+#define OFF_IPV62PROTO (offsetof(struct ipv6_hdr, proto))
+#define MBUF_IPV4_2PROTO(m) (rte_pktmbuf_mtod((m), uint8_t *) + OFF_ETHHEAD + OFF_IPV42PROTO)
+#define MBUF_IPV6_2PROTO(m) (rte_pktmbuf_mtod((m), uint8_t *) + OFF_ETHHEAD + OFF_IPV62PROTO)
+
+#define GET_CB_FIELD(in, fd, base, lim, dlm)	do {            \
+	unsigned long val;                                      \
+	char *end;                                              \
+	errno = 0;                                              \
+	val = strtoul((in), &end, (base));                      \
+	if (errno != 0 || end[0] != (dlm) || val > (lim))       \
+		return (-EINVAL);                               \
+	(fd) = (typeof (fd))val;                                \
+	(in) = end + 1;                                         \
+} while (0)
+
+#define CLASSIFY(context, data, res, num, cat) do {		\
+	if(scalar)						\
+		rte_acl_classify_scalar((context), (data),	\
+		(res), (num), (cat));				\
+	else							\
+		rte_acl_classify((context), (data),		\
+		(res), (num), (cat));				\
+} while (0)
+
+/*
+ * ACL rules should have higher priorities than route ones to ensure that
+ * an ACL rule is always found when input packets have multiple matches
+ * in the database. An exception is performance measurement, which can
+ * define route rules with higher priority so that a route rule is always
+ * returned by each lookup.
+ * Reserve the range from ACL_RULE_PRIORITY_MAX + 1 to
+ * RTE_ACL_MAX_PRIORITY for route entries in performance measurement.
+ */
+#define ACL_RULE_PRIORITY_MAX 0x10000000
+
+/*
+ * The forwarding port stored in the ACL library starts from 1, since the
+ * ACL library assumes 0 is an invalid userdata value. Therefore, add 1
+ * when saving the port and subtract 1 when forwarding packets.
+ */
+#define FWD_PORT_SHIFT 1
+
+/*
+ * Rule and trace formats definitions.
+ */
+
+enum {
+	PROTO_FIELD_IPV4,
+	SRC_FIELD_IPV4,
+	DST_FIELD_IPV4,
+	SRCP_FIELD_IPV4,
+	DSTP_FIELD_IPV4,
+	NUM_FIELDS_IPV4
+};
+
+struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof (uint8_t),
+		.field_index = PROTO_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PROTO,
+		.offset = 0,
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = SRC_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_SRC,
+		.offset = offsetof(struct ipv4_hdr, src_addr)
+				- offsetof(struct ipv4_hdr, next_proto_id),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = DST_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_DST,
+		.offset = offsetof(struct ipv4_hdr, dst_addr)
+				- offsetof(struct ipv4_hdr, next_proto_id),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof (uint16_t),
+		.field_index = SRCP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = sizeof(struct ipv4_hdr)
+				- offsetof(struct ipv4_hdr, next_proto_id),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof (uint16_t),
+		.field_index = DSTP_FIELD_IPV4,
+		.input_index = RTE_ACL_IPV4VLAN_PORTS,
+		.offset = sizeof(struct ipv4_hdr)
+				- offsetof(struct ipv4_hdr, next_proto_id) + sizeof (uint16_t),
+	},
+};
+
+#define	IPV6_ADDR_LEN	16
+#define	IPV6_ADDR_U16	(IPV6_ADDR_LEN / sizeof(uint16_t))
+#define	IPV6_ADDR_U32	(IPV6_ADDR_LEN / sizeof(uint32_t))
+
+enum {
+	PROTO_FIELD_IPV6,
+	SRC1_FIELD_IPV6,
+	SRC2_FIELD_IPV6,
+	SRC3_FIELD_IPV6,
+	SRC4_FIELD_IPV6,
+	DST1_FIELD_IPV6,
+	DST2_FIELD_IPV6,
+	DST3_FIELD_IPV6,
+	DST4_FIELD_IPV6,
+	SRCP_FIELD_IPV6,
+	DSTP_FIELD_IPV6,
+	NUM_FIELDS_IPV6
+};
+
+struct rte_acl_field_def ipv6_defs[NUM_FIELDS_IPV6] = {
+	{
+		.type = RTE_ACL_FIELD_TYPE_BITMASK,
+		.size = sizeof (uint8_t),
+		.field_index = PROTO_FIELD_IPV6,
+		.input_index = PROTO_FIELD_IPV6,
+		.offset = 0,
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = SRC1_FIELD_IPV6,
+		.input_index = SRC1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr)
+				- offsetof(struct ipv6_hdr, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = SRC2_FIELD_IPV6,
+		.input_index = SRC2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr)
+				- offsetof(struct ipv6_hdr, proto) + sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = SRC3_FIELD_IPV6,
+		.input_index = SRC3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr)
+				- offsetof(struct ipv6_hdr, proto) + 2 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = SRC4_FIELD_IPV6,
+		.input_index = SRC4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, src_addr)
+				- offsetof(struct ipv6_hdr, proto) + 3 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = DST1_FIELD_IPV6,
+		.input_index = DST1_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr)
+				- offsetof(struct ipv6_hdr, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = DST2_FIELD_IPV6,
+		.input_index = DST2_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr)
+				- offsetof(struct ipv6_hdr, proto) + sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = DST3_FIELD_IPV6,
+		.input_index = DST3_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr)
+				- offsetof(struct ipv6_hdr, proto) + 2 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_MASK,
+		.size = sizeof (uint32_t),
+		.field_index = DST4_FIELD_IPV6,
+		.input_index = DST4_FIELD_IPV6,
+		.offset = offsetof(struct ipv6_hdr, dst_addr)
+				- offsetof(struct ipv6_hdr, proto) + 3 * sizeof(uint32_t),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof (uint16_t),
+		.field_index = SRCP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = sizeof(struct ipv6_hdr)
+				- offsetof(struct ipv6_hdr, proto),
+	},
+	{
+		.type = RTE_ACL_FIELD_TYPE_RANGE,
+		.size = sizeof (uint16_t),
+		.field_index = DSTP_FIELD_IPV6,
+		.input_index = SRCP_FIELD_IPV6,
+		.offset = sizeof(struct ipv6_hdr)
+				- offsetof(struct ipv6_hdr, proto) + sizeof (uint16_t),
+	},
+};
+
+enum {
+	CB_FLD_SRC_ADDR,
+	CB_FLD_DST_ADDR,
+	CB_FLD_SRC_PORT_LOW,
+	CB_FLD_SRC_PORT_DLM,
+	CB_FLD_SRC_PORT_HIGH,
+	CB_FLD_DST_PORT_LOW,
+	CB_FLD_DST_PORT_DLM,
+	CB_FLD_DST_PORT_HIGH,
+	CB_FLD_PROTO,
+	CB_FLD_USERDATA,
+	CB_FLD_NUM,
+};
+
+RTE_ACL_RULE_DEF(acl4_rule, RTE_DIM(ipv4_defs));
+RTE_ACL_RULE_DEF(acl6_rule, RTE_DIM(ipv6_defs));
+
+struct acl_search_t{
+	const uint8_t * data_ipv4[MAX_PKT_BURST];
+	struct rte_mbuf *m_ipv4[MAX_PKT_BURST];
+	uint32_t res_ipv4[MAX_PKT_BURST];
+	int num_ipv4;
+
+	const uint8_t * data_ipv6[MAX_PKT_BURST];
+	struct rte_mbuf *m_ipv6[MAX_PKT_BURST];
+	uint32_t res_ipv6[MAX_PKT_BURST];
+	int num_ipv6;
+};
+
+static struct {
+	char mapped[NB_SOCKETS];
+	struct rte_acl_ctx *acx_ipv4[NB_SOCKETS];
+	struct rte_acl_ctx *acx_ipv6[NB_SOCKETS];
+#ifdef L3FWDACL_DEBUG
+	struct acl4_rule *rule_ipv4;
+	struct acl6_rule *rule_ipv6;
+#endif
+} acl_config;
+
+static struct{
+	const char *rule_ipv4_name;
+	const char *rule_ipv6_name;
+	int scalar;
+}parm_config;
+
+const char cb_port_delim[] = ":";
+
+static inline void
+print_one_ipv4_rule(struct acl4_rule *rule, int extra)
+{
+	unsigned char a, b, c, d;
+
+	uint32_t_to_char(rule->field[SRC_FIELD_IPV4].value.u32,
+			&a, &b, &c, &d);
+	printf("%hhu.%hhu.%hhu.%hhu/%u ", a, b, c, d,
+			rule->field[SRC_FIELD_IPV4].mask_range.u32);
+	uint32_t_to_char(rule->field[DST_FIELD_IPV4].value.u32,
+			&a, &b, &c, &d);
+	printf("%hhu.%hhu.%hhu.%hhu/%u ", a, b, c, d,
+			rule->field[DST_FIELD_IPV4].mask_range.u32);
+	printf("%hu : %hu %hu : %hu 0x%hhx/0x%hhx ",
+		rule->field[SRCP_FIELD_IPV4].value.u16,
+		rule->field[SRCP_FIELD_IPV4].mask_range.u16,
+		rule->field[DSTP_FIELD_IPV4].value.u16,
+		rule->field[DSTP_FIELD_IPV4].mask_range.u16,
+		rule->field[PROTO_FIELD_IPV4].value.u8,
+		rule->field[PROTO_FIELD_IPV4].mask_range.u8);
+	if(extra)
+		printf("0x%x-0x%x-0x%x ",
+			rule->data.category_mask,
+			rule->data.priority,
+			rule->data.userdata);
+}
+
+static inline void
+print_one_ipv6_rule(struct acl6_rule *rule, int extra)
+{
+	unsigned char a, b, c, d;
+
+	uint32_t_to_char(rule->field[SRC1_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf("%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[SRC2_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[SRC3_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[SRC4_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x/%u ", a, b, c, d,
+			rule->field[SRC1_FIELD_IPV6].mask_range.u32
+			+ rule->field[SRC2_FIELD_IPV6].mask_range.u32
+			+ rule->field[SRC3_FIELD_IPV6].mask_range.u32
+			+ rule->field[SRC4_FIELD_IPV6].mask_range.u32);
+
+	uint32_t_to_char(rule->field[DST1_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf("%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[DST2_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[DST3_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x", a, b, c, d);
+	uint32_t_to_char(rule->field[DST4_FIELD_IPV6].value.u32, &a, &b, &c, &d);
+	printf(":%.2x%.2x:%.2x%.2x/%u ", a, b, c, d,
+			rule->field[DST1_FIELD_IPV6].mask_range.u32
+			+ rule->field[DST2_FIELD_IPV6].mask_range.u32
+			+ rule->field[DST3_FIELD_IPV6].mask_range.u32
+			+ rule->field[DST4_FIELD_IPV6].mask_range.u32);
+
+	printf("%hu : %hu %hu : %hu 0x%hhx/0x%hhx ",
+		rule->field[SRCP_FIELD_IPV6].value.u16,
+		rule->field[SRCP_FIELD_IPV6].mask_range.u16,
+		rule->field[DSTP_FIELD_IPV6].value.u16,
+		rule->field[DSTP_FIELD_IPV6].mask_range.u16,
+		rule->field[PROTO_FIELD_IPV6].value.u8,
+		rule->field[PROTO_FIELD_IPV6].mask_range.u8);
+	if(extra)
+		printf("0x%x-0x%x-0x%x ",
+			rule->data.category_mask,
+			rule->data.priority,
+			rule->data.userdata);
+}
+
+/* Skip comment and empty lines */
+static inline int
+is_bypass_line(char *buff)
+{
+	int i = 0;
+
+	/* comment line */
+	if( buff[0] == COMMENT_LEAD_CHAR)
+		return 1;
+	/* empty line */
+	while(buff[i] != '\0'){
+		if(!isspace(buff[i]))
+			return 0;
+		i++;
+	}
+	return 1;
+}
+
+#ifdef L3FWDACL_DEBUG
+static inline void
+dump_acl4_rule(struct rte_mbuf *m, uint32_t sig)
+{
+	uint32_t offset = sig & ~ACL_DENY_SIGNATURE;
+	unsigned char a, b, c, d;
+	struct ipv4_hdr *ipv4_hdr = (struct ipv4_hdr *)
+					(rte_pktmbuf_mtod(m, unsigned char *) +
+					sizeof(struct ether_hdr));
+
+	uint32_t_to_char(rte_bswap32(ipv4_hdr->src_addr), &a, &b, &c, &d);
+	printf("Packet Src:%hhu.%hhu.%hhu.%hhu ", a, b, c, d);
+	uint32_t_to_char(rte_bswap32(ipv4_hdr->dst_addr), &a, &b, &c, &d);
+	printf("Dst:%hhu.%hhu.%hhu.%hhu ", a, b, c, d);
+
+	printf("Src port:%hu,Dst port:%hu ",
+			rte_bswap16(*(uint16_t *)(ipv4_hdr + 1)),
+			rte_bswap16(*((uint16_t *)(ipv4_hdr + 1) + 1)));
+	printf("hit ACL %d - ", offset);
+
+	print_one_ipv4_rule(acl_config.rule_ipv4 + offset, 1);
+
+	printf("\n\n");
+}
+
+static inline void
+dump_acl6_rule(struct rte_mbuf *m, uint32_t sig)
+{
+	unsigned i;
+	uint32_t offset = sig & ~ACL_DENY_SIGNATURE;
+	struct ipv6_hdr *ipv6_hdr = (struct ipv6_hdr *)
+					(rte_pktmbuf_mtod(m, unsigned char *) +
+					sizeof(struct ether_hdr));
+
+	printf("Packet Src");
+	for (i = 0; i < RTE_DIM(ipv6_hdr->src_addr); i += sizeof (uint16_t))
+		printf(":%.2x%.2x", ipv6_hdr->src_addr[i], ipv6_hdr->src_addr[i + 1]);
+
+	printf("\nDst");
+	for (i = 0; i < RTE_DIM(ipv6_hdr->dst_addr); i += sizeof (uint16_t))
+		printf(":%.2x%.2x", ipv6_hdr->dst_addr[i], ipv6_hdr->dst_addr[i + 1]);
+
+	printf("\nSrc port:%hu,Dst port:%hu ",
+			rte_bswap16(*(uint16_t *)(ipv6_hdr + 1)),
+			rte_bswap16(*((uint16_t *)(ipv6_hdr + 1) + 1)));
+	printf("hit ACL %d - ", offset);
+
+	print_one_ipv6_rule(acl_config.rule_ipv6 + offset, 1);
+
+	printf("\n\n");
+}
+#endif /* L3FWDACL_DEBUG */
+
+static inline void
+dump_ipv4_rules(struct acl4_rule *rule, int num, int extra)
+{
+	int i;
+
+	for(i = 0; i < num; i++, rule++){
+		printf("\t%d:", i + 1);
+		print_one_ipv4_rule(rule, extra);
+		printf("\n");
+	}
+}
+
+static inline void
+dump_ipv6_rules(struct acl6_rule *rule, int num, int extra)
+{
+	int i;
+
+	for(i = 0; i < num; i++, rule++){
+		printf("\t%d:", i + 1);
+		print_one_ipv6_rule(rule, extra);
+		printf("\n");
+	}
+}
+
+#ifdef DO_RFC_1812_CHECKS
+static inline void
+prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
+	int index)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct rte_mbuf *pkt = pkts_in[index];
+
+	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
+
+	if (type == PKT_RX_IPV4_HDR) {
+
+		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt,
+			unsigned char *) + sizeof(struct ether_hdr));
+
+		/* Check to make sure the packet is valid (RFC1812) */
+		if (is_valid_ipv4_pkt(ipv4_hdr, pkt->pkt.pkt_len) >= 0) {
+
+			/* Update time to live and header checksum */
+			--(ipv4_hdr->time_to_live);
+			++(ipv4_hdr->hdr_checksum);
+
+			/* Fill acl structure */
+			acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
+			acl->m_ipv4[(acl->num_ipv4)++] = pkt;
+
+		} else {
+			/* Not a valid IPv4 packet */
+			rte_pktmbuf_free(pkt);
+		}
+
+	} else if (type == PKT_RX_IPV6_HDR) {
+
+		/* Fill acl structure */
+		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
+		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
+
+	} else {
+		/* Unknown type, drop the packet */
+		rte_pktmbuf_free(pkt);
+	}
+}
+
+#else
+static inline void
+prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl, int index)
+{
+	struct rte_mbuf *pkt = pkts_in[index];
+
+	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
+
+	if (type == PKT_RX_IPV4_HDR) {
+
+		/* Fill acl structure */
+		acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
+		acl->m_ipv4[(acl->num_ipv4)++] = pkt;
+
+
+	} else if (type == PKT_RX_IPV6_HDR) {
+
+		/* Fill acl structure */
+		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
+		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
+	} else {
+		/* Unknown type, drop the packet */
+		rte_pktmbuf_free(pkt);
+	}
+}
+#endif /* DO_RFC_1812_CHECKS */
+
+static inline void
+prepare_acl_parameter(struct rte_mbuf **pkts_in, struct acl_search_t *acl, int nb_rx)
+{
+	int i;
+
+	acl->num_ipv4 = 0;
+	acl->num_ipv6 = 0;
+
+	/* Prefetch first packets */
+	for (i = 0; i < PREFETCH_OFFSET && i < nb_rx; i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(
+				pkts_in[i], void *));
+	}
+
+	for(i = 0; i < (nb_rx - PREFETCH_OFFSET); i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(pkts_in[
+				i + PREFETCH_OFFSET], void *));
+		prepare_one_packet(pkts_in, acl, i);
+	}
+
+	/* Process remaining packets */
+	for (; i < nb_rx; i++)
+		prepare_one_packet(pkts_in, acl, i);
+}
+
+static inline void
+send_one_packet(struct rte_mbuf *m, uint32_t res)
+{
+	if(likely((res & ACL_DENY_SIGNATURE) == 0 && res != 0)) {
+		/* forward packets */
+		send_single_packet(m,
+			(uint8_t)(res - FWD_PORT_SHIFT));
+	}
+	else{
+		/* in the ACL list, drop it */
+#ifdef L3FWDACL_DEBUG
+		if((res & ACL_DENY_SIGNATURE) != 0) {
+			if (m->ol_flags & PKT_RX_IPV4_HDR)
+				dump_acl4_rule(m, res);
+			else
+				dump_acl6_rule(m, res);
+		}
+#endif
+		rte_pktmbuf_free(m);
+	}
+}
+
+
+
+static inline void
+send_packets(struct rte_mbuf **m, uint32_t *res, int num)
+{
+	int i;
+
+	/* Prefetch first packets */
+	for (i = 0; i < PREFETCH_OFFSET && i < num; i++) {
+		rte_prefetch0(rte_pktmbuf_mtod(
+				m[i], void *));
+	}
+
+	for(i = 0; i < (num - PREFETCH_OFFSET); i++){
+		rte_prefetch0(rte_pktmbuf_mtod(m[
+				i + PREFETCH_OFFSET], void *));
+		send_one_packet(m[i], res[i]);
+	}
+
+	/* Process remaining packets */
+	for (; i < num; i++) {
+		send_one_packet(m[i], res[i]);
+	}
+
+}
+
+/*
+ * Parse an IPv6 address; expects the following format:
+ * XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX (where X is a hexadecimal digit).
+ */
+static int
+parse_ipv6_addr(const char *in, const char **end, uint32_t v[IPV6_ADDR_U32],
+	char dlm)
+{
+	uint32_t addr[IPV6_ADDR_U16];
+
+	GET_CB_FIELD(in, addr[0], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[1], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[2], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[3], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[4], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[5], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[6], 16, UINT16_MAX, ':');
+	GET_CB_FIELD(in, addr[7], 16, UINT16_MAX, dlm);
+
+	*end = in;
+
+	v[0] = (addr[0] << 16) + addr[1];
+	v[1] = (addr[2] << 16) + addr[3];
+	v[2] = (addr[4] << 16) + addr[5];
+	v[3] = (addr[6] << 16) + addr[7];
+
+	return (0);
+}
+
+static int
+parse_ipv6_net(const char *in, struct rte_acl_field field[4])
+{
+	int32_t rc;
+	const char *mp;
+	uint32_t i, m, v[4];
+	const uint32_t nbu32 = sizeof (uint32_t) * CHAR_BIT;
+
+	/* get address. */
+	if ((rc = parse_ipv6_addr(in, &mp, v, '/')) != 0)
+		return (rc);
+
+	/* get mask. */
+	GET_CB_FIELD(mp, m, 0, CHAR_BIT * sizeof (v), 0);
+
+	/* put all together. */
+	for (i = 0; i != RTE_DIM(v); i++) {
+		if (m >= (i + 1) * nbu32)
+			field[i].mask_range.u32 = nbu32;
+		else
+			field[i].mask_range.u32 = m > (i * nbu32) ?
+				m - (i * nbu32) : 0;
+
+		field[i].value.u32 = v[i];
+	}
+
+	return (0);
+}
+
+static int
+parse_cb_ipv6_rule(char *str, struct rte_acl_rule *v, int has_userdata)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+	int dim = has_userdata ? CB_FLD_NUM : CB_FLD_USERDATA;
+	s = str;
+
+	for (i = 0; i != dim; i++, s = NULL) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_SRC_ADDR],
+			v->field + SRC1_FIELD_IPV6)) != 0) {
+		acl_log("failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv6_net(in[CB_FLD_DST_ADDR],
+			v->field + DST1_FIELD_IPV6)) != 0) {
+		acl_log("failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	/* source port. */
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof (cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	/* destination port. */
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV6].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV6].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof (cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	if (v->field[SRCP_FIELD_IPV6].mask_range.u16
+			< v->field[SRCP_FIELD_IPV6].value.u16
+			|| v->field[DSTP_FIELD_IPV6].mask_range.u16
+			< v->field[DSTP_FIELD_IPV6].value.u16)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV6].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	if (has_userdata)
+		GET_CB_FIELD(in[CB_FLD_USERDATA], v->data.userdata,
+			0, UINT32_MAX, 0);
+
+	return (0);
+}
+
+/*
+ * Parse ClassBench rules file.
+ * Expected format:
+ * '@'<src_ipv4_addr>'/'<masklen> <space> \
+ * <dst_ipv4_addr>'/'<masklen> <space> \
+ * <src_port_low> <space> ":" <src_port_high> <space> \
+ * <dst_port_low> <space> ":" <dst_port_high> <space> \
+ * <proto>'/'<mask>
+ */
+static int
+parse_ipv4_net(const char *in, uint32_t *addr, uint32_t *mask_len)
+{
+	uint8_t a, b, c, d, m;
+
+	GET_CB_FIELD(in, a, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, b, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, c, 0, UINT8_MAX, '.');
+	GET_CB_FIELD(in, d, 0, UINT8_MAX, '/');
+	GET_CB_FIELD(in, m, 0, sizeof (uint32_t) * CHAR_BIT, 0);
+
+	addr[0] = IPv4(a, b, c, d);
+	mask_len[0] = m;
+
+	return (0);
+}
+
+static int
+parse_cb_ipv4vlan_rule(char *str, struct rte_acl_rule *v, int has_userdata)
+{
+	int i, rc;
+	char *s, *sp, *in[CB_FLD_NUM];
+	static const char *dlm = " \t\n";
+	int dim = has_userdata ? CB_FLD_NUM : CB_FLD_USERDATA;
+
+	s = str;
+
+	for (i = 0; i != dim; i++, s = NULL) {
+		if ((in[i] = strtok_r(s, dlm, &sp)) == NULL)
+			return (-EINVAL);
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_SRC_ADDR],
+			&v->field[SRC_FIELD_IPV4].value.u32,
+			&v->field[SRC_FIELD_IPV4].mask_range.u32)) != 0) {
+		acl_log("failed to read source address/mask: %s\n",
+			in[CB_FLD_SRC_ADDR]);
+		return (rc);
+	}
+
+	if ((rc = parse_ipv4_net(in[CB_FLD_DST_ADDR],
+			&v->field[DST_FIELD_IPV4].value.u32,
+			&v->field[DST_FIELD_IPV4].mask_range.u32)) != 0) {
+		acl_log("failed to read destination address/mask: %s\n",
+			in[CB_FLD_DST_ADDR]);
+		return (rc);
+	}
+
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_LOW],
+		v->field[SRCP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_SRC_PORT_HIGH],
+		v->field[SRCP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_SRC_PORT_DLM], cb_port_delim,
+			sizeof (cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_LOW],
+		v->field[DSTP_FIELD_IPV4].value.u16,
+		0, UINT16_MAX, 0);
+	GET_CB_FIELD(in[CB_FLD_DST_PORT_HIGH],
+		v->field[DSTP_FIELD_IPV4].mask_range.u16,
+		0, UINT16_MAX, 0);
+
+	if (strncmp(in[CB_FLD_DST_PORT_DLM], cb_port_delim,
+			sizeof (cb_port_delim)) != 0)
+		return (-EINVAL);
+
+	if (v->field[SRCP_FIELD_IPV4].mask_range.u16
+			< v->field[SRCP_FIELD_IPV4].value.u16
+			|| v->field[DSTP_FIELD_IPV4].mask_range.u16
+			< v->field[DSTP_FIELD_IPV4].value.u16)
+		return (-EINVAL);
+
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].value.u8,
+		0, UINT8_MAX, '/');
+	GET_CB_FIELD(in[CB_FLD_PROTO], v->field[PROTO_FIELD_IPV4].mask_range.u8,
+		0, UINT8_MAX, 0);
+
+	if (has_userdata)
+		GET_CB_FIELD(in[CB_FLD_USERDATA], v->data.userdata,
+			0, UINT32_MAX, 0);
+
+	return (0);
+}
+
+static int
+add_rules(const char *rule_path,
+		struct rte_acl_rule **proute_base,
+		unsigned int *proute_num,
+		struct rte_acl_rule **pacl_base,
+		unsigned int *pacl_num, uint32_t rule_size,
+		int (*parser)(char *, struct rte_acl_rule*, int))
+{
+	uint8_t *acl_rules, *route_rules;
+	struct rte_acl_rule *next;
+	unsigned int acl_num = 0, route_num = 0, total_num = 0;
+	unsigned int acl_cnt = 0, route_cnt = 0;
+	char buff[LINE_MAX];
+	FILE *fh = fopen(rule_path, "rb");
+	unsigned int i = 0;
+
+	if (fh == NULL)
+		rte_exit(EXIT_FAILURE, "%s: Open %s failed\n", __func__, rule_path);
+
+	while (fgets(buff, LINE_MAX, fh) != NULL) {
+		if (buff[0] == ROUTE_LEAD_CHAR)
+			route_num++;
+		else if (buff[0] == ACL_LEAD_CHAR)
+			acl_num++;
+	}
+
+	if (route_num == 0)
+		rte_exit(EXIT_FAILURE, "Did not find any route entries in %s!\n",
+				rule_path);
+
+	fseek(fh, 0, SEEK_SET);
+
+	acl_rules = (uint8_t *)calloc(acl_num, rule_size);
+
+	if (acl_rules == NULL)
+		rte_exit(EXIT_FAILURE, "%s: failed to allocate memory\n",
+			__func__);
+
+	route_rules = (uint8_t *)calloc(route_num, rule_size);
+
+	if (route_rules == NULL)
+		rte_exit(EXIT_FAILURE, "%s: failed to allocate memory\n",
+			__func__);
+
+	i = 0;
+	while (fgets(buff, LINE_MAX, fh) != NULL) {
+		i++;
+
+		if (is_bypass_line(buff))
+			continue;
+
+		char s = buff[0];
+
+		/* Route entry */
+		if (s == ROUTE_LEAD_CHAR)
+			next = (struct rte_acl_rule *)(route_rules +
+				route_cnt * rule_size);
+
+		/* ACL entry */
+		else if (s == ACL_LEAD_CHAR)
+			next = (struct rte_acl_rule *)(acl_rules +
+				acl_cnt * rule_size);
+
+		/* Illegal line */
+		else
+			rte_exit(EXIT_FAILURE,
+				"%s Line %u: should start with leading char %c or %c\n",
+				rule_path, i, ROUTE_LEAD_CHAR, ACL_LEAD_CHAR);
+
+		if (parser(buff + 1, next, s == ROUTE_LEAD_CHAR) != 0)
+			rte_exit(EXIT_FAILURE, "%s Line %u: parse rules error\n",
+				rule_path, i);
+
+		if (s == ROUTE_LEAD_CHAR) {
+			/* Check the forwarding port number */
+			if ((enabled_port_mask & (1 << next->data.userdata)) == 0)
+				rte_exit(EXIT_FAILURE,
+					"%s Line %u: fwd number illegal:%u\n",
+					rule_path, i, next->data.userdata);
+			next->data.userdata += FWD_PORT_SHIFT;
+			route_cnt++;
+		} else {
+			next->data.userdata = ACL_DENY_SIGNATURE + acl_cnt;
+			acl_cnt++;
+		}
+
+		next->data.priority = RTE_ACL_MAX_PRIORITY - total_num;
+		next->data.category_mask = -1;
+		total_num++;
+	}
+
+	fclose(fh);
+
+	*pacl_base = (struct rte_acl_rule *)acl_rules;
+	*pacl_num = acl_cnt;
+	*proute_base = (struct rte_acl_rule *)route_rules;
+	*proute_num = route_cnt;
+
+	return 0;
+}
+
+static void
+dump_acl_config(void)
+{
+	printf("ACL options are:\n");
+	printf(OPTION_RULE_IPV4": %s\n", parm_config.rule_ipv4_name);
+	printf(OPTION_RULE_IPV6": %s\n", parm_config.rule_ipv6_name);
+	printf(OPTION_SCALAR": %d\n", parm_config.scalar);
+}
+
+static int
+check_acl_config(void)
+{
+	if (parm_config.rule_ipv4_name == NULL) {
+		acl_log("ACL IPv4 rule file not specified\n");
+		return -1;
+	} else if (parm_config.rule_ipv6_name == NULL) {
+		acl_log("ACL IPv6 rule file not specified\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static struct rte_acl_ctx*
+setup_acl(struct rte_acl_rule *route_base,
+		struct rte_acl_rule *acl_base, unsigned int route_num,
+		unsigned int acl_num, int ipv6, int socketid)
+{
+	char name[PATH_MAX];
+	struct rte_acl_param acl_param;
+	struct rte_acl_config acl_build_param;
+	struct rte_acl_ctx *context;
+	int dim = ipv6 ? RTE_DIM(ipv6_defs) : RTE_DIM(ipv4_defs);
+
+	/* Create ACL contexts */
+	rte_snprintf(name, sizeof(name), "%s%d",
+			ipv6 ? L3FWD_ACL_IPV6_NAME : L3FWD_ACL_IPV4_NAME,
+			socketid);
+
+	acl_param.name = name;
+	acl_param.socket_id = socketid;
+	acl_param.rule_size = RTE_ACL_RULE_SZ(dim);
+	acl_param.max_rule_num = MAX_ACL_RULE_NUM;
+
+	if ((context = rte_acl_create(&acl_param)) == NULL)
+		rte_exit(EXIT_FAILURE, "Failed to create ACL context\n");
+
+	if (rte_acl_add_rules(context, route_base, route_num) < 0)
+		rte_exit(EXIT_FAILURE, "add rules failed\n");
+
+	if (rte_acl_add_rules(context, acl_base, acl_num) < 0)
+		rte_exit(EXIT_FAILURE, "add rules failed\n");
+
+	/* Perform builds */
+	acl_build_param.num_categories = DEFAULT_MAX_CATEGORIES;
+
+	acl_build_param.num_fields = dim;
+	memcpy(&acl_build_param.defs, ipv6 ? ipv6_defs : ipv4_defs,
+		ipv6 ? sizeof(ipv6_defs) : sizeof(ipv4_defs));
+
+	if (rte_acl_build(context, &acl_build_param) != 0)
+		rte_exit(EXIT_FAILURE, "Failed to build ACL trie\n");
+
+	rte_acl_dump(context);
+
+	return context;
+}
+
+static int
+app_acl_init(void)
+{
+	unsigned int lcore_id;
+	unsigned int i;
+	int socketid;
+	struct rte_acl_rule *acl_base_ipv4, *route_base_ipv4;
+	struct rte_acl_rule *acl_base_ipv6, *route_base_ipv6;
+	unsigned int acl_num_ipv4 = 0, route_num_ipv4 = 0;
+	unsigned int acl_num_ipv6 = 0, route_num_ipv6 = 0;
+
+	if (check_acl_config() != 0)
+		rte_exit(EXIT_FAILURE, "Failed to get valid ACL options\n");
+
+	dump_acl_config();
+
+	/* Load rules from the input file */
+	if (add_rules(parm_config.rule_ipv4_name, &route_base_ipv4,
+			&route_num_ipv4, &acl_base_ipv4, &acl_num_ipv4,
+			sizeof(struct acl4_rule), &parse_cb_ipv4vlan_rule) < 0)
+		rte_exit(EXIT_FAILURE, "Failed to add rules\n");
+
+	acl_log("IPv4 Route entries %u:\n", route_num_ipv4);
+	dump_ipv4_rules((struct acl4_rule *)route_base_ipv4, route_num_ipv4, 1);
+
+	acl_log("IPv4 ACL entries %u:\n", acl_num_ipv4);
+	dump_ipv4_rules((struct acl4_rule *)acl_base_ipv4, acl_num_ipv4, 1);
+
+	if (add_rules(parm_config.rule_ipv6_name, &route_base_ipv6,
+			&route_num_ipv6, &acl_base_ipv6, &acl_num_ipv6,
+			sizeof(struct acl6_rule), &parse_cb_ipv6_rule) < 0)
+		rte_exit(EXIT_FAILURE, "Failed to add rules\n");
+
+	acl_log("IPv6 Route entries %u:\n", route_num_ipv6);
+	dump_ipv6_rules((struct acl6_rule *)route_base_ipv6, route_num_ipv6, 1);
+
+	acl_log("IPv6 ACL entries %u:\n", acl_num_ipv6);
+	dump_ipv6_rules((struct acl6_rule *)acl_base_ipv6, acl_num_ipv6, 1);
+
+	memset(&acl_config, 0, sizeof(acl_config));
+
+	/* Check sockets a context should be created on */
+	if (!numa_on)
+		acl_config.mapped[0] = 1;
+	else {
+		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+			if (rte_lcore_is_enabled(lcore_id) == 0)
+				continue;
+
+			socketid = rte_lcore_to_socket_id(lcore_id);
+			if (socketid >= NB_SOCKETS) {
+				acl_log("Socket %d of lcore %u is out of range %d\n",
+						socketid, lcore_id, NB_SOCKETS);
+				return -1;
+			}
+
+			acl_config.mapped[socketid] = 1;
+		}
+	}
+
+	for (i = 0; i < NB_SOCKETS; i++) {
+		if (acl_config.mapped[i]) {
+			acl_config.acx_ipv4[i] = setup_acl(route_base_ipv4,
+				acl_base_ipv4, route_num_ipv4, acl_num_ipv4, 0, i);
+
+			acl_config.acx_ipv6[i] = setup_acl(route_base_ipv6,
+				acl_base_ipv6, route_num_ipv6, acl_num_ipv6, 1, i);
+		}
+	}
+
+	free(route_base_ipv4);
+	free(route_base_ipv6);
+
+#ifdef L3FWDACL_DEBUG
+	acl_config.rule_ipv4 = (struct acl4_rule *)acl_base_ipv4;
+	acl_config.rule_ipv6 = (struct acl6_rule *)acl_base_ipv6;
+#else
+	free(acl_base_ipv4);
+	free(acl_base_ipv6);
+#endif
+
+	return 0;
+}
+
+/***********************end of ACL part******************************/
+
+struct lcore_conf {
+	uint16_t n_rx_queue;
+	struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
+	uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
+	struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+} __rte_cache_aligned;
+
+static struct lcore_conf lcore_conf[RTE_MAX_LCORE];
+
+/* Send burst of packets on an output interface */
+static inline int
+send_burst(struct lcore_conf *qconf, uint16_t n, uint8_t port)
+{
+	struct rte_mbuf **m_table;
+	int ret;
+	uint16_t queueid;
+
+	queueid = qconf->tx_queue_id[port];
+	m_table = (struct rte_mbuf **)qconf->tx_mbufs[port].m_table;
+
+	ret = rte_eth_tx_burst(port, queueid, m_table, n);
+	if (unlikely(ret < n)) {
+		do {
+			rte_pktmbuf_free(m_table[ret]);
+		} while (++ret < n);
+	}
+
+	return 0;
+}
+
+/* Enqueue a single packet, and send burst if queue is filled */
+static inline int
+send_single_packet(struct rte_mbuf *m, uint8_t port)
+{
+	uint32_t lcore_id;
+	uint16_t len;
+	struct lcore_conf *qconf;
+
+	lcore_id = rte_lcore_id();
+
+	qconf = &lcore_conf[lcore_id];
+	len = qconf->tx_mbufs[port].len;
+	qconf->tx_mbufs[port].m_table[len] = m;
+	len++;
+
+	/* enough pkts to be sent */
+	if (unlikely(len == MAX_PKT_BURST)) {
+		send_burst(qconf, MAX_PKT_BURST, port);
+		len = 0;
+	}
+
+	qconf->tx_mbufs[port].len = len;
+	return 0;
+}
+
+#ifdef DO_RFC_1812_CHECKS
+static inline int
+is_valid_ipv4_pkt(struct ipv4_hdr *pkt, uint32_t link_len)
+{
+	/* From http://www.rfc-editor.org/rfc/rfc1812.txt section 5.2.2 */
+	/*
+	 * 1. The packet length reported by the Link Layer must be large
+	 * enough to hold the minimum length legal IP datagram (20 bytes).
+	 */
+	if (link_len < sizeof(struct ipv4_hdr))
+		return -1;
+
+	/* 2. The IP checksum must be correct. */
+	/* this is checked in H/W */
+
+	/*
+	 * 3. The IP version number must be 4. If the version number is not 4
+	 * then the packet may be another version of IP, such as IPng or
+	 * ST-II.
+	 */
+	if (((pkt->version_ihl) >> 4) != 4)
+		return -3;
+	/*
+	 * 4. The IP header length field must be large enough to hold the
+	 * minimum length legal IP datagram (20 bytes = 5 words).
+	 */
+	if ((pkt->version_ihl & 0xf) < 5)
+		return -4;
+
+	/*
+	 * 5. The IP total length field must be large enough to hold the IP
+	 * datagram header, whose length is specified in the IP header length
+	 * field.
+	 */
+	if (rte_cpu_to_be_16(pkt->total_length) < sizeof(struct ipv4_hdr))
+		return -5;
+
+	return 0;
+}
+#endif
+
+/* main processing loop */
+static int
+main_loop(__attribute__((unused)) void *dummy)
+{
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	unsigned lcore_id;
+	uint64_t prev_tsc, diff_tsc, cur_tsc;
+	int i, nb_rx;
+	uint8_t portid, queueid;
+	struct lcore_conf *qconf;
+	int socketid;
+	const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1)
+			/ US_PER_S * BURST_TX_DRAIN_US;
+	int scalar = parm_config.scalar;
+
+	prev_tsc = 0;
+
+	lcore_id = rte_lcore_id();
+	qconf = &lcore_conf[lcore_id];
+	socketid = rte_lcore_to_socket_id(lcore_id);
+
+	if (qconf->n_rx_queue == 0) {
+		RTE_LOG(INFO, L3FWD, "lcore %u has nothing to do\n", lcore_id);
+		return 0;
+	}
+
+	RTE_LOG(INFO, L3FWD, "entering main loop on lcore %u\n", lcore_id);
+
+	for (i = 0; i < qconf->n_rx_queue; i++) {
+
+		portid = qconf->rx_queue_list[i].port_id;
+		queueid = qconf->rx_queue_list[i].queue_id;
+		RTE_LOG(INFO, L3FWD,
+			" -- lcoreid=%u portid=%hhu rxqueueid=%hhu\n",
+			lcore_id, portid, queueid);
+	}
+
+	while (1) {
+
+		cur_tsc = rte_rdtsc();
+
+		/*
+		 * TX burst queue drain
+		 */
+		diff_tsc = cur_tsc - prev_tsc;
+		if (unlikely(diff_tsc > drain_tsc)) {
+
+			/*
+			 * This could be optimized (use queueid instead of
+			 * portid), but it is not called so often
+			 */
+			for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
+				if (qconf->tx_mbufs[portid].len == 0)
+					continue;
+				send_burst(&lcore_conf[lcore_id],
+					qconf->tx_mbufs[portid].len,
+					portid);
+				qconf->tx_mbufs[portid].len = 0;
+			}
+
+			prev_tsc = cur_tsc;
+		}
+
+		/*
+		 * Read packet from RX queues
+		 */
+		for (i = 0; i < qconf->n_rx_queue; ++i) {
+
+			portid = qconf->rx_queue_list[i].port_id;
+			queueid = qconf->rx_queue_list[i].queue_id;
+			nb_rx = rte_eth_rx_burst(portid, queueid,
+				pkts_burst, MAX_PKT_BURST);
+
+			if (nb_rx > 0) {
+				struct acl_search_t acl_search;
+
+				prepare_acl_parameter(pkts_burst, &acl_search,
+					nb_rx);
+
+				if (acl_search.num_ipv4) {
+					CLASSIFY(acl_config.acx_ipv4[socketid],
+						acl_search.data_ipv4,
+						acl_search.res_ipv4,
+						acl_search.num_ipv4,
+						DEFAULT_MAX_CATEGORIES);
+
+					send_packets(acl_search.m_ipv4,
+						acl_search.res_ipv4,
+						acl_search.num_ipv4);
+				}
+
+				if (acl_search.num_ipv6) {
+					CLASSIFY(acl_config.acx_ipv6[socketid],
+						acl_search.data_ipv6,
+						acl_search.res_ipv6,
+						acl_search.num_ipv6,
+						DEFAULT_MAX_CATEGORIES);
+
+					send_packets(acl_search.m_ipv6,
+						acl_search.res_ipv6,
+						acl_search.num_ipv6);
+				}
+			}
+		}
+	}
+}
+
+static int
+check_lcore_params(void)
+{
+	uint8_t queue, lcore;
+	uint16_t i;
+	int socketid;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		queue = lcore_params[i].queue_id;
+		if (queue >= MAX_RX_QUEUE_PER_PORT) {
+			printf("invalid queue number: %hhu\n", queue);
+			return -1;
+		}
+		lcore = lcore_params[i].lcore_id;
+		if (!rte_lcore_is_enabled(lcore)) {
+			printf("error: lcore %hhu is not enabled in lcore mask\n", lcore);
+			return -1;
+		}
+		socketid = rte_lcore_to_socket_id(lcore);
+		if (socketid != 0 && numa_on == 0) {
+			printf("warning: lcore %hhu is on socket %d with numa off\n",
+				lcore, socketid);
+		}
+	}
+	return 0;
+}
+
+static int
+check_port_config(const unsigned nb_ports)
+{
+	unsigned portid;
+	uint16_t i;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		portid = lcore_params[i].port_id;
+
+		if ((enabled_port_mask & (1 << portid)) == 0) {
+			printf("port %u is not enabled in port mask\n", portid);
+			return -1;
+		}
+		if (portid >= nb_ports) {
+			printf("port %u is not present on the board\n", portid);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static uint8_t
+get_port_n_rx_queues(const uint8_t port)
+{
+	int queue = -1;
+	uint16_t i;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		if (lcore_params[i].port_id == port && lcore_params[i].queue_id > queue)
+			queue = lcore_params[i].queue_id;
+	}
+	return (uint8_t)(++queue);
+}
+
+static int
+init_lcore_rx_queues(void)
+{
+	uint16_t i, nb_rx_queue;
+	uint8_t lcore;
+
+	for (i = 0; i < nb_lcore_params; ++i) {
+		lcore = lcore_params[i].lcore_id;
+		nb_rx_queue = lcore_conf[lcore].n_rx_queue;
+		if (nb_rx_queue >= MAX_RX_QUEUE_PER_LCORE) {
+			printf("error: too many queues (%u) for lcore: %u\n",
+				(unsigned)nb_rx_queue + 1, (unsigned)lcore);
+			return -1;
+		} else {
+			lcore_conf[lcore].rx_queue_list[nb_rx_queue].port_id =
+				lcore_params[i].port_id;
+			lcore_conf[lcore].rx_queue_list[nb_rx_queue].queue_id =
+				lcore_params[i].queue_id;
+			lcore_conf[lcore].n_rx_queue++;
+		}
+	}
+	return 0;
+}
+
+/* display usage */
+static void
+print_usage(const char *prgname)
+{
+	printf("%s [EAL options] -- -p PORTMASK -P"
+		" --"OPTION_RULE_IPV4"=FILE"
+		" --"OPTION_RULE_IPV6"=FILE"
+		"  [--"OPTION_CONFIG" (port,queue,lcore)[,(port,queue,lcore)]]"
+		"  [--"OPTION_ENBJMO" [--max-pkt-len PKTLEN]]\n"
+		"  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
+		"  -P: enable promiscuous mode\n"
+		"  --"OPTION_CONFIG": (port,queue,lcore): rx queues configuration\n"
+		"  --"OPTION_NONUMA": optional, disable numa awareness\n"
+		"  --"OPTION_ENBJMO": enable jumbo frames;"
+		" the max packet length is PKTLEN in decimal (64-9600)\n"
+		"  --"OPTION_RULE_IPV4"=FILE: specify the ipv4 rules file."
+		" Each rule occupies one line. Two kinds of rules are supported:"
+		" an ACL rule, on a line with leading character '%c', and a"
+		" route rule, on a line with leading character '%c'.\n"
+		"  --"OPTION_RULE_IPV6"=FILE: specify the ipv6 rules file.\n"
+		"  --"OPTION_SCALAR": use the scalar function to do lookup\n",
+		prgname, ACL_LEAD_CHAR, ROUTE_LEAD_CHAR);
+}
+
+static int
+parse_max_pkt_len(const char *pktlen)
+{
+	char *end = NULL;
+	unsigned long len;
+
+	/* parse decimal string */
+	len = strtoul(pktlen, &end, 10);
+	if ((pktlen[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+
+	if (len == 0)
+		return -1;
+
+	return len;
+}
+
+static int
+parse_portmask(const char *portmask)
+{
+	char *end = NULL;
+	unsigned long pm;
+
+	/* parse hexadecimal string */
+	pm = strtoul(portmask, &end, 16);
+	if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
+		return -1;
+
+	if (pm == 0)
+		return -1;
+
+	return pm;
+}
+
+static int
+parse_config(const char *q_arg)
+{
+	char s[256];
+	const char *p, *p0 = q_arg;
+	char *end;
+	enum fieldnames {
+		FLD_PORT = 0,
+		FLD_QUEUE,
+		FLD_LCORE,
+		_NUM_FLD
+	};
+	unsigned long int_fld[_NUM_FLD];
+	char *str_fld[_NUM_FLD];
+	int i;
+	unsigned size;
+
+	nb_lcore_params = 0;
+
+	while ((p = strchr(p0, '(')) != NULL) {
+		++p;
+		p0 = strchr(p, ')');
+		if (p0 == NULL)
+			return -1;
+
+		size = p0 - p;
+		if (size >= sizeof(s))
+			return -1;
+
+		rte_snprintf(s, sizeof(s), "%.*s", size, p);
+		if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') != _NUM_FLD)
+			return -1;
+		for (i = 0; i < _NUM_FLD; i++) {
+			errno = 0;
+			int_fld[i] = strtoul(str_fld[i], &end, 0);
+			if (errno != 0 || end == str_fld[i] || int_fld[i] > 255)
+				return -1;
+		}
+		if (nb_lcore_params >= MAX_LCORE_PARAMS) {
+			printf("exceeded max number of lcore params: %hu\n",
+				nb_lcore_params);
+			return -1;
+		}
+		lcore_params_array[nb_lcore_params].port_id = (uint8_t)int_fld[FLD_PORT];
+		lcore_params_array[nb_lcore_params].queue_id = (uint8_t)int_fld[FLD_QUEUE];
+		lcore_params_array[nb_lcore_params].lcore_id = (uint8_t)int_fld[FLD_LCORE];
+		++nb_lcore_params;
+	}
+	lcore_params = lcore_params_array;
+	return 0;
+}
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt, ret;
+	char **argvopt;
+	int option_index;
+	char *prgname = argv[0];
+	static struct option lgopts[] = {
+		{OPTION_CONFIG, 1, 0, 0},
+		{OPTION_NONUMA, 0, 0, 0},
+		{OPTION_ENBJMO, 0, 0, 0},
+		{OPTION_RULE_IPV4, 1, 0, 0},
+		{OPTION_RULE_IPV6, 1, 0, 0},
+		{OPTION_SCALAR, 0, 0, 0},
+		{NULL, 0, 0, 0}
+	};
+
+	argvopt = argv;
+
+	while ((opt = getopt_long(argc, argvopt, "p:P",
+				lgopts, &option_index)) != EOF) {
+
+		switch (opt) {
+		/* portmask */
+		case 'p':
+			enabled_port_mask = parse_portmask(optarg);
+			if (enabled_port_mask == 0) {
+				printf("invalid portmask\n");
+				print_usage(prgname);
+				return -1;
+			}
+			break;
+		case 'P':
+			printf("Promiscuous mode selected\n");
+			promiscuous_on = 1;
+			break;
+
+		/* long options */
+		case 0:
+			if (!strncmp(lgopts[option_index].name,
+						OPTION_CONFIG, sizeof(OPTION_CONFIG))) {
+				ret = parse_config(optarg);
+				if (ret) {
+					printf("invalid config\n");
+					print_usage(prgname);
+					return -1;
+				}
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+						OPTION_NONUMA, sizeof(OPTION_NONUMA))) {
+				printf("numa is disabled\n");
+				numa_on = 0;
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_ENBJMO, sizeof(OPTION_ENBJMO))) {
+				struct option lenopts = {"max-pkt-len",
+					required_argument, 0, 0};
+
+				printf("jumbo frame is enabled\n");
+				port_conf.rxmode.jumbo_frame = 1;
+
+				/*
+				 * if no max-pkt-len set, use the default
+				 * value ETHER_MAX_LEN
+				 */
+				if (getopt_long(argc, argvopt, "",
+						&lenopts, &option_index) == 0) {
+					ret = parse_max_pkt_len(optarg);
+					if (ret < 64 || ret > MAX_JUMBO_PKT_LEN) {
+						printf("invalid packet length\n");
+						print_usage(prgname);
+						return -1;
+					}
+					port_conf.rxmode.max_rx_pkt_len = ret;
+				}
+				printf("set jumbo frame max packet length to %u\n",
+					(unsigned int)port_conf.rxmode.max_rx_pkt_len);
+			}
+
+			if (!strncmp(lgopts[option_index].name,
+						OPTION_RULE_IPV4, sizeof(OPTION_RULE_IPV4)))
+				parm_config.rule_ipv4_name = optarg;
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_RULE_IPV6, sizeof(OPTION_RULE_IPV6)))
+				parm_config.rule_ipv6_name = optarg;
+
+			if (!strncmp(lgopts[option_index].name,
+					OPTION_SCALAR, sizeof(OPTION_SCALAR)))
+				parm_config.scalar = 1;
+
+			break;
+
+		default:
+			print_usage(prgname);
+			return -1;
+		}
+	}
+
+	if (optind >= 0)
+		argv[optind-1] = prgname;
+
+	ret = optind-1;
+	optind = 0; /* reset getopt lib */
+	return ret;
+}
+
+static void
+print_ethaddr(const char *name, const struct ether_addr *eth_addr)
+{
+	printf("%s%02X:%02X:%02X:%02X:%02X:%02X", name,
+		eth_addr->addr_bytes[0],
+		eth_addr->addr_bytes[1],
+		eth_addr->addr_bytes[2],
+		eth_addr->addr_bytes[3],
+		eth_addr->addr_bytes[4],
+		eth_addr->addr_bytes[5]);
+}
+
+static int
+init_mem(unsigned nb_mbuf)
+{
+	int socketid;
+	unsigned lcore_id;
+	char s[64];
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		if (rte_lcore_is_enabled(lcore_id) == 0)
+			continue;
+
+		if (numa_on)
+			socketid = rte_lcore_to_socket_id(lcore_id);
+		else
+			socketid = 0;
+
+		if (socketid >= NB_SOCKETS) {
+			rte_exit(EXIT_FAILURE, "Socket %d of lcore %u is out of range %d\n",
+				socketid, lcore_id, NB_SOCKETS);
+		}
+		if (pktmbuf_pool[socketid] == NULL) {
+			rte_snprintf(s, sizeof(s), "mbuf_pool_%d", socketid);
+			pktmbuf_pool[socketid] =
+				rte_mempool_create(s, nb_mbuf, MBUF_SIZE, MEMPOOL_CACHE_SIZE,
+					sizeof(struct rte_pktmbuf_pool_private),
+					rte_pktmbuf_pool_init, NULL,
+					rte_pktmbuf_init, NULL,
+					socketid, 0);
+			if (pktmbuf_pool[socketid] == NULL)
+				rte_exit(EXIT_FAILURE,
+						"Cannot init mbuf pool on socket %d\n", socketid);
+			else
+				printf("Allocated mbuf pool on socket %d\n", socketid);
+		}
+	}
+	return 0;
+}
+
+/* Check the link status of all ports in up to 9s, and print them finally */
+static void
+check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
+{
+#define CHECK_INTERVAL 100 /* 100ms */
+#define MAX_CHECK_TIME 90 /* 9s (90 * 100ms) in total */
+	uint8_t portid, count, all_ports_up, print_flag = 0;
+	struct rte_eth_link link;
+
+	printf("\nChecking link status");
+	fflush(stdout);
+	for (count = 0; count <= MAX_CHECK_TIME; count++) {
+		all_ports_up = 1;
+		for (portid = 0; portid < port_num; portid++) {
+			if ((port_mask & (1 << portid)) == 0)
+				continue;
+			memset(&link, 0, sizeof(link));
+			rte_eth_link_get_nowait(portid, &link);
+			/* print link status if flag set */
+			if (print_flag == 1) {
+				if (link.link_status)
+					printf("Port %d Link Up - speed %u "
+						"Mbps - %s\n", (uint8_t)portid,
+						(unsigned)link.link_speed,
+				(link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
+					("full-duplex") : ("half-duplex\n"));
+				else
+					printf("Port %d Link Down\n",
+						(uint8_t)portid);
+				continue;
+			}
+			/* clear all_ports_up flag if any link down */
+			if (link.link_status == 0) {
+				all_ports_up = 0;
+				break;
+			}
+		}
+		/* after finally printing all link status, get out */
+		if (print_flag == 1)
+			break;
+
+		if (all_ports_up == 0) {
+			printf(".");
+			fflush(stdout);
+			rte_delay_ms(CHECK_INTERVAL);
+		}
+
+		/* set the print_flag if all ports up or timeout */
+		if (all_ports_up == 1 || count == (MAX_CHECK_TIME - 1)) {
+			print_flag = 1;
+			printf("done\n");
+		}
+	}
+}
+
+int
+MAIN(int argc, char **argv)
+{
+	struct lcore_conf *qconf;
+	int ret;
+	unsigned nb_ports;
+	uint16_t queueid;
+	unsigned lcore_id;
+	uint32_t n_tx_queue, nb_lcores;
+	uint8_t portid, nb_rx_queue, queue, socketid;
+
+	/* init EAL */
+	ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid EAL parameters\n");
+	argc -= ret;
+	argv += ret;
+
+	/* parse application arguments (after the EAL ones) */
+	ret = parse_args(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Invalid L3FWD parameters\n");
+
+	if (check_lcore_params() < 0)
+		rte_exit(EXIT_FAILURE, "check_lcore_params failed\n");
+
+	ret = init_lcore_rx_queues();
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "init_lcore_rx_queues failed\n");
+
+	if (rte_eal_pci_probe() < 0)
+		rte_exit(EXIT_FAILURE, "Cannot probe PCI\n");
+
+	nb_ports = rte_eth_dev_count();
+	if (nb_ports > RTE_MAX_ETHPORTS)
+		nb_ports = RTE_MAX_ETHPORTS;
+
+	if (check_port_config(nb_ports) < 0)
+		rte_exit(EXIT_FAILURE, "check_port_config failed\n");
+
+	/* Add ACL rules and route entries, build trie */
+	if (app_acl_init() < 0)
+		rte_exit(EXIT_FAILURE, "app_acl_init failed\n");
+
+	nb_lcores = rte_lcore_count();
+
+	/* initialize all ports */
+	for (portid = 0; portid < nb_ports; portid++) {
+		/* skip ports that are not enabled */
+		if ((enabled_port_mask & (1 << portid)) == 0) {
+			printf("\nSkipping disabled port %d\n", portid);
+			continue;
+		}
+
+		/* init port */
+		printf("Initializing port %d ... ", portid);
+		fflush(stdout);
+
+		nb_rx_queue = get_port_n_rx_queues(portid);
+		n_tx_queue = nb_lcores;
+		if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
+			n_tx_queue = MAX_TX_QUEUE_PER_PORT;
+		printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
+			nb_rx_queue, (unsigned)n_tx_queue);
+		ret = rte_eth_dev_configure(portid, nb_rx_queue,
+					(uint16_t)n_tx_queue, &port_conf);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "Cannot configure device: err=%d, port=%d\n",
+				ret, portid);
+
+		rte_eth_macaddr_get(portid, &ports_eth_addr[portid]);
+		print_ethaddr(" Address:", &ports_eth_addr[portid]);
+		printf(", ");
+
+		/* init memory */
+		ret = init_mem(NB_MBUF);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "init_mem failed\n");
+
+		/* init one TX queue per couple (lcore,port) */
+		queueid = 0;
+		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+			if (rte_lcore_is_enabled(lcore_id) == 0)
+				continue;
+
+			if (numa_on)
+				socketid = (uint8_t)rte_lcore_to_socket_id(lcore_id);
+			else
+				socketid = 0;
+
+			printf("txq=%u,%d,%d ", lcore_id, queueid, socketid);
+			fflush(stdout);
+			ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
+						     socketid, &tx_conf);
+			if (ret < 0)
+				rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: err=%d, "
+					"port=%d\n", ret, portid);
+
+			qconf = &lcore_conf[lcore_id];
+			qconf->tx_queue_id[portid] = queueid;
+			queueid++;
+		}
+		printf("\n");
+	}
+
+	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		if (rte_lcore_is_enabled(lcore_id) == 0)
+			continue;
+		qconf = &lcore_conf[lcore_id];
+		printf("\nInitializing rx queues on lcore %u ... ", lcore_id);
+		fflush(stdout);
+		/* init RX queues */
+		for (queue = 0; queue < qconf->n_rx_queue; ++queue) {
+			portid = qconf->rx_queue_list[queue].port_id;
+			queueid = qconf->rx_queue_list[queue].queue_id;
+
+			if (numa_on)
+				socketid = (uint8_t)rte_lcore_to_socket_id(lcore_id);
+			else
+				socketid = 0;
+
+			printf("rxq=%d,%d,%d ", portid, queueid, socketid);
+			fflush(stdout);
+
+			ret = rte_eth_rx_queue_setup(portid, queueid, nb_rxd,
+				socketid, &rx_conf, pktmbuf_pool[socketid]);
+			if (ret < 0)
+				rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup: err=%d,"
+						"port=%d\n", ret, portid);
+		}
+	}
+
+	printf("\n");
+
+	/* start ports */
+	for (portid = 0; portid < nb_ports; portid++) {
+		if ((enabled_port_mask & (1 << portid)) == 0) {
+			continue;
+		}
+		/* Start device */
+		ret = rte_eth_dev_start(portid);
+		if (ret < 0)
+			rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, port=%d\n",
+				ret, portid);
+
+		/*
+		 * If enabled, put device in promiscuous mode.
+		 * This allows IO forwarding mode to forward packets
+		 * to itself through 2 cross-connected  ports of the
+		 * target machine.
+		 */
+		if (promiscuous_on)
+			rte_eth_promiscuous_enable(portid);
+	}
+
+	check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
+
+	/* launch per-lcore init on every lcore */
+	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
+	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+		if (rte_eal_wait_lcore(lcore_id) < 0)
+			return -1;
+	}
+
+	return 0;
+}
diff --git a/examples/l3fwd-acl/main.h b/examples/l3fwd-acl/main.h
new file mode 100644
index 0000000..f54938b
--- /dev/null
+++ b/examples/l3fwd-acl/main.h
@@ -0,0 +1,45 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _MAIN_H_
+#define _MAIN_H_
+
+#ifdef RTE_EXEC_ENV_BAREMETAL
+#define MAIN _main
+#else
+#define MAIN main
+#endif
+
+int MAIN(int argc, char **argv);
+
+#endif /* _MAIN_H_ */
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCHv2 5/5] acl: add doxygen configuration and start page
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
                   ` (3 preceding siblings ...)
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 4/5] acl: New sample l3fwd-acl Konstantin Ananyev
@ 2014-05-28 19:26 ` Konstantin Ananyev
  2014-06-06  5:54 ` [dpdk-dev] [PATCHv2 0/5] ACL library Cao, Waterman
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Konstantin Ananyev @ 2014-05-28 19:26 UTC (permalink / raw)
  To: dev, dev

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 doc/doxy-api-index.md |    3 ++-
 doc/doxy-api.conf     |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/doc/doxy-api-index.md b/doc/doxy-api-index.md
index 2825c08..5e4cea9 100644
--- a/doc/doxy-api-index.md
+++ b/doc/doxy-api-index.md
@@ -78,7 +78,8 @@ There are many libraries, so their headers may be grouped by topics:
   [SCTP]               (@ref rte_sctp.h),
   [TCP]                (@ref rte_tcp.h),
   [UDP]                (@ref rte_udp.h),
-  [LPM route]          (@ref rte_lpm.h)
+  [LPM route]          (@ref rte_lpm.h),
+  [ACL]                (@ref rte_acl.h)
 
 - **QoS**:
   [metering]           (@ref rte_meter.h),
diff --git a/doc/doxy-api.conf b/doc/doxy-api.conf
index 642f77a..b1fc16a 100644
--- a/doc/doxy-api.conf
+++ b/doc/doxy-api.conf
@@ -44,7 +44,8 @@ INPUT                   = doc/doxy-api-index.md \
                           lib/librte_power \
                           lib/librte_ring \
                           lib/librte_sched \
-                          lib/librte_timer
+                          lib/librte_timer \
+                          lib/librte_acl
 FILE_PATTERNS           = rte_*.h \
                           cmdline.h
 PREDEFINED              = __DOXYGEN__ \
-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCHv2 0/5] ACL library
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
                   ` (4 preceding siblings ...)
  2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 5/5] acl: add doxygen configuration and start page Konstantin Ananyev
@ 2014-06-06  5:54 ` Cao, Waterman
  2014-06-06  8:32 ` De Lara Guarch, Pablo
  2014-06-11 22:01 ` Thomas Monjalon
  7 siblings, 0 replies; 9+ messages in thread
From: Cao, Waterman @ 2014-06-06  5:54 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev, Thomas Monjalon

Tested-by: Waterman Cao <waterman.cao@intel.com>

This patch has been tested by Intel.
Basic functional tests have been performed on 'l3fwd-acl'; all cases passed.
They include test_l3fwdacl_acl_rule, test_l3fwdacl_exact_route, test_l3fwdacl_lpm_route, test_l3fwdAcl_Scalar and test_l3fwdacl_invalid.
There is one known issue with 'acl rule problem with protocol'; Konstantin will submit a minor patch to fix it later.
Test Environment:
Fedora 20 x86_64, Linux kernel 3.11.10-301, GCC 4.8.2, Intel Xeon CPU E5-2680 v2 @ 2.80GHz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCHv2 0/5] ACL library
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
                   ` (5 preceding siblings ...)
  2014-06-06  5:54 ` [dpdk-dev] [PATCHv2 0/5] ACL library Cao, Waterman
@ 2014-06-06  8:32 ` De Lara Guarch, Pablo
  2014-06-11 22:01 ` Thomas Monjalon
  7 siblings, 0 replies; 9+ messages in thread
From: De Lara Guarch, Pablo @ 2014-06-06  8:32 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev

Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Konstantin Ananyev
> Sent: Wednesday, May 28, 2014 8:27 PM
> To: dev@dpdk.org; dev@dpdk.org
> Subject: [dpdk-dev] [PATCHv2 0/5] ACL library
> 
> The ACL library is used to perform an N-tuple search over a set of rules
> with multiple categories and find the best match (highest priority)
> for each category.
> This code was previously released under a proprietary license,
> but is now being released under a BSD license to allow its
> integration with the rest of the Intel DPDK codebase.
> 
> Note that this patch series requires the patch
> "lpm: Introduce rte_lpm_lookupx4" to be applied first.
> 
> This patch series contains the following items:
> 1) librte_acl.
> 2) UT changes to reflect the latest changes in the rte_acl library.
> 3) test-acl: usage example and main test application for the ACL library.
>    Provides IPv4/IPv6 5-tuple classification.
> 4) l3fwd-acl: demonstrates the use of the ACL library in the DPDK application
>    to implement packet classification and L3 forwarding.
> 5) add doxygen configuration and start page
> 
> v2 fixes:
> * Fixed several checkpatch.pl issues
> * Added doxygen related changes
> 
>  app/Makefile                         |    1 +
>  app/test-acl/Makefile                |   45 +
>  app/test-acl/main.c                  | 1029 +++++++++++++++++
>  app/test-acl/main.h                  |   50 +
>  app/test/test_acl.c                  |  128 ++-
>  config/common_linuxapp               |    6 +
>  doc/doxy-api-index.md                |    3 +-
>  doc/doxy-api.conf                    |    3 +-
>  examples/Makefile                    |    1 +
>  examples/l3fwd-acl/Makefile          |   56 +
>  examples/l3fwd-acl/main.c            | 2048 ++++++++++++++++++++++++++++++++++
>  examples/l3fwd-acl/main.h            |   45 +
>  lib/librte_acl/Makefile              |   60 +
>  lib/librte_acl/acl.h                 |  182 +++
>  lib/librte_acl/acl_bld.c             | 2001 +++++++++++++++++++++++++++++++++
>  lib/librte_acl/acl_gen.c             |  473 ++++++++
>  lib/librte_acl/acl_run.c             |  927 +++++++++++++++
>  lib/librte_acl/acl_vect.h            |  129 +++
>  lib/librte_acl/rte_acl.c             |  413 +++++++
>  lib/librte_acl/rte_acl.h             |  453 ++++++++
>  lib/librte_acl/rte_acl_osdep.h       |   92 ++
>  lib/librte_acl/rte_acl_osdep_alone.h |  277 +++++
>  lib/librte_acl/tb_mem.c              |  102 ++
>  lib/librte_acl/tb_mem.h              |   73 ++
>  24 files changed, 8552 insertions(+), 45 deletions(-)
>  create mode 100644 app/test-acl/Makefile
>  create mode 100644 app/test-acl/main.c
>  create mode 100644 app/test-acl/main.h
>  create mode 100644 examples/l3fwd-acl/Makefile
>  create mode 100644 examples/l3fwd-acl/main.c
>  create mode 100644 examples/l3fwd-acl/main.h
>  create mode 100644 lib/librte_acl/Makefile
>  create mode 100644 lib/librte_acl/acl.h
>  create mode 100644 lib/librte_acl/acl_bld.c
>  create mode 100644 lib/librte_acl/acl_gen.c
>  create mode 100644 lib/librte_acl/acl_run.c
>  create mode 100644 lib/librte_acl/acl_vect.h
>  create mode 100644 lib/librte_acl/rte_acl.c
>  create mode 100644 lib/librte_acl/rte_acl.h
>  create mode 100644 lib/librte_acl/rte_acl_osdep.h
>  create mode 100644 lib/librte_acl/rte_acl_osdep_alone.h
>  create mode 100644 lib/librte_acl/tb_mem.c
>  create mode 100644 lib/librte_acl/tb_mem.h
> 
> --
> 1.7.7.6

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCHv2 0/5] ACL library
  2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
                   ` (6 preceding siblings ...)
  2014-06-06  8:32 ` De Lara Guarch, Pablo
@ 2014-06-11 22:01 ` Thomas Monjalon
  7 siblings, 0 replies; 9+ messages in thread
From: Thomas Monjalon @ 2014-06-11 22:01 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev

Hi Konstantin,

2014-05-28 20:26, Konstantin Ananyev:
> v2 fixes:
> * Fixed several checkpatch.pl issues

It seems that many checkpatch issues are remaining.
Could you re-check please?

Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-06-11 22:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-28 19:26 [dpdk-dev] [PATCHv2 0/5] ACL library Konstantin Ananyev
2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 1/5] acl: Add ACL library (librte_acl) into DPDK Konstantin Ananyev
2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 2/5] acl: update UT to reflect latest changes in the librte_acl Konstantin Ananyev
2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 3/5] acl: New test-acl application Konstantin Ananyev
2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 4/5] acl: New sample l3fwd-acl Konstantin Ananyev
2014-05-28 19:26 ` [dpdk-dev] [PATCHv2 5/5] acl: add doxygen configuration and start page Konstantin Ananyev
2014-06-06  5:54 ` [dpdk-dev] [PATCHv2 0/5] ACL library Cao, Waterman
2014-06-06  8:32 ` De Lara Guarch, Pablo
2014-06-11 22:01 ` Thomas Monjalon
