From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from Perses.usurf.usu.edu (unknown [129.123.197.145]) by dpdk.org (Postfix) with ESMTP id 39DC3152A for ; Fri, 6 Jan 2017 23:51:57 +0100 (CET) Received: from EK.usurf.usu.edu (172.31.35.73) by Perses.usurf.usu.edu (172.31.35.68) with Microsoft SMTP Server (TLS) id 15.0.1210.3; Fri, 6 Jan 2017 15:51:56 -0700 Received: from EK.usurf.usu.edu (172.31.35.73) by Ek.usurf.usu.edu (172.31.35.73) with Microsoft SMTP Server (TLS) id 15.0.1210.3; Fri, 6 Jan 2017 15:51:56 -0700 Received: from EK.usurf.usu.edu ([172.31.35.73]) by Ek.usurf.usu.edu ([172.31.35.73]) with mapi id 15.00.1210.000; Fri, 6 Jan 2017 15:51:56 -0700 From: Nate Jensen To: "users@dpdk.org" Thread-Topic: Understanding ACL Implementation Thread-Index: AdJoaL8mRn7bU0sfRNqxtB8dvl5tVgABqxWg Date: Fri, 6 Jan 2017 22:51:55 +0000 Message-ID: <058c7292427a461e9cd5ea03ec3b686b@Ek.usurf.usu.edu> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [172.31.24.23] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: [dpdk-users] Understanding ACL Implementation X-BeenThere: users@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK usage discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Jan 2017 22:51:57 -0000 Hello DPDK Users: I have some in-depth/rigorous questions about DPDK's ACL matching capabilit= ies. To answer these questions, I have been scouring the web, reading much= documentation, and pouring over the source code to the best of my ability.= As I dig through the source, I am struggling to understand all of nuances= of the implementation and was hoping someone might have a bit higher level= discussion about how the ACL matching algorithm behaves. If this doesn't = exist, then I will continue to plow through the source, but it isn't a fast= process. Currently, my organization has a product that does ACL matching by using a = TCAM. Because of this, I can make hard guarantees about our systems perfor= mance. A TCAM has some desirable traits: 1. Line rate matching of 64K frames=20 2. No performance degradation as rule set increases in size, (until the TCA= M is exhausted of course) 3. No restrictions on rule set, (i.e. no rule set behaves better or worse t= han another. They all work at line rate) 4. Adding or removing a rule is fast 5. No preprocessing or "compilation" of the rule set is required While the TCAM is nice, I am interested in pushing our product away from th= e specialized hardware and replace the TCAM with DPDK. However, I have con= cerns about the ACL matching capability provided by DPDK. I would like to = make some assertions about what rule sets perform better than others. Spec= ifically, I would like to know: 1. What types of rules increase memory use? 2. What types of rules lead to decreased performance? 3. Has benchmarking been done to find pathological worst case rule sets? = What do these rule sets look like? Personally, I have built several "home grown" ACL matching algorithms on to= p of DPDK. I have experimented with many variants of trie trees, cross pro= ducting all permutations of fields/dimensions, and many others. None of my= implementations match the observed capability of the DPDK provided ACL imp= lementation. I have even generated, (via script), random ACL rulesets that= can be quite large (10K rules). When loaded into the l3fwd-acl example pr= ogram, performance is still quite strong. Rule sets that were a pathologic= al worst case for my implementations were generally handled gracefully by D= PDK ACLs. I need to understand this implementation, but I tend to get lost= when working through the source. I understand that the ACL implementation uses tries with an 8bit stride. H= owever, I don't understand how the DPDK implementation efficiently handles = don't cares or ranges that may be found in various search fields. My homeg= rown trie implementation accounted for this in one of two ways: 1. Expand the tree to account for don't cares -- This was effective, but p= athological rule sets caused huge amounts of RAM usage. A don't care bit e= arly on in the rule resulted in many duplicative copies of subtrees under t= he don't care trie paths. 2. Use a "ternary" trie with backtracking -- This is also effective, but s= ince the trie was ternary it may have to backtrack to follow the don't care= path. In the pathological worst case, the traversal essentially had to ex= amine every rule. Performance was abysmal. What I need to understand is the DPDK ACL algorithm and the motivation behi= nd it. It seems that ACLs are generated it two phases. The build phase ev= aluates each rule and then sorts it for "wildness." This is a concept/heur= istic that is currently lost on me. I then see that the algorithm develops= a trie for each rule. It seems that this initial trie is actually a 1-bit= stride trie that may be quite large. The algorithm then moves to the "gen= erate" phase where the internal tries are compacted into something more spa= ce/time efficient. It is unclear to me how the algorithm doesn't suffer fr= om: 1. Memory bloat if runtime performance is favored 2. Matching performance suffering is memory efficiency is favored If someone had any background on the algorithm implemented that could help = me see the forest from the trees I would be grateful. I am also quite will= ing to document what I learn and feed this documentation back to the commun= ity if it is of value. I would also be willing to provide my testing resul= ts showing performance observations from best/average/pathological rule set= s. Best Regards, Nathan Jensen