From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail.mhcomputing.net (master.mhcomputing.net [74.208.228.170])
	by dpdk.org (Postfix) with ESMTP id 06BE25913
	for ; Wed, 2 Dec 2015 16:47:42 +0100 (CET)
Received: by mail.mhcomputing.net (Postfix, from userid 1000)
	id 6F6CC324; Wed, 2 Dec 2015 10:47:41 -0500 (EST)
Date: Wed, 2 Dec 2015 10:47:41 -0500
From: Matthew Hall
To: Bruce Richardson
Message-ID: <20151202154741.GA17618@mhcomputing.net>
References: <26FA93C7ED1EAA44AB77D62FBE1D27BA674705F1@IRSMSX108.ger.corp.intel.com>
	<20151201125935.GA20658@mhcomputing.net>
	<26FA93C7ED1EAA44AB77D62FBE1D27BA67470C36@IRSMSX108.ger.corp.intel.com>
	<20151201134457.GB21396@mhcomputing.net>
	<20151201135739.GA31804@bricha3-MOBL3>
	<20151201194946.GC28164@mhcomputing.net>
	<20151202123516.GA30204@bricha3-MOBL3>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151202123516.GA30204@bricha3-MOBL3>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] 2.3 Roadmap
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Wed, 02 Dec 2015 15:47:42 -0000

On Wed, Dec 02, 2015 at 12:35:16PM +0000, Bruce Richardson wrote:
> Hi Matthew,
>
> thanks for the info, but I'm not sure I understand it correctly. It seems to
> me that you are mostly referring to the depths/sizes of the tables being used,
> rather than to the "data-size" being stored in each entry, which was actually
> what I was asking about. Is that correct? If so, it seems that - looking initially
> at IPv4 LPM only - you are more looking for an increase in the number of tbl8's
> for lookup, rather than necessarily an increase the 8-bit user data being stored
> with each entry. [And assuming similar interest for v6] Am I right in
> thinking this?
>
> Thanks,
> /Bruce

This question comes from a difference in perspective between routing /
networking and security. I actually do need to increase the size of the user
data, as I did in my patches.

1. When LPM is used for routing, the assumption is that many millions of
inputs map to a much smaller number of outputs.

2. That assumption does not hold in the security ecosystem. If I have several
million CIDR blocks and bad IPs, I need a separate user data output value for
each input value.

3. Every time I have a bad IP, CIDR, Domain, URL, or Email, I create a
security indicator tracking struct for it. In the IP and CIDR case I find the
struct using rte_hash (possibly for single IPs) and rte_lpm. For Domain, URL,
and Email, rte_hash cannot be used, because it assumes all inputs are
equal-length, so I had to use a different hash table.

4. The struct contains things such as a unique 64-bit unsigned integer for
each separate IP or CIDR triggered, to allow looking up contextual data about
the threat it represents. These IDs are defined by upstream threat databases,
so I can't crunch them down to fit inside rte_lpm. The struct also includes
stats on how many times an indicator is seen, what kind of security threat it
represents, etc. Without these you can't do the security enrichment needed to
respond to the events generated.

5. This means that if I want to support X million security indicators,
regardless of whether they are IP, CIDR, Domain, URL, or Email, then I need
X million distinct user data values to look up all the context that goes with
them.

Matthew.
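
To make the indexing argument concrete, here is a minimal sketch in plain C.
It is not DPDK code: toy_lpm is a hypothetical linear-scan stand-in for
rte_lpm, and struct indicator with its field names is an assumed layout for
the tracking struct described above. The only point it illustrates is that the
user data returned by the lookup must be wide enough to index one record per
indicator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-indicator record; the field names are illustrative. */
struct indicator {
    uint64_t upstream_id;  /* ID assigned by an upstream threat database */
    uint64_t hits;         /* number of times this indicator has matched */
    uint8_t  threat_kind;  /* application-defined threat category */
};

/* One CIDR rule in a toy longest-prefix-match table (not rte_lpm). */
struct cidr_rule {
    uint32_t prefix;     /* network address, host byte order */
    uint8_t  depth;      /* prefix length, 0..32 */
    uint32_t user_data;  /* index into the indicator array */
};

#define MAX_RULES 1024

struct toy_lpm {
    struct cidr_rule rules[MAX_RULES];
    size_t n_rules;
};

/* Netmask for a prefix length; depth 0 is special-cased to avoid a 32-bit shift. */
static uint32_t depth_mask(uint8_t depth)
{
    assert(depth <= 32);
    return depth == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - depth));
}

/* Add a rule; returns 0 on success, -1 on bad depth or full table. */
static int toy_lpm_add(struct toy_lpm *t, uint32_t ip, uint8_t depth,
                       uint32_t user_data)
{
    if (depth > 32 || t->n_rules >= MAX_RULES)
        return -1;
    struct cidr_rule *r = &t->rules[t->n_rules++];
    r->prefix = ip & depth_mask(depth);
    r->depth = depth;
    r->user_data = user_data;
    return 0;
}

/* Linear scan for the longest matching prefix; returns 0 on hit, -1 on miss. */
static int toy_lpm_lookup(const struct toy_lpm *t, uint32_t ip,
                          uint32_t *user_data)
{
    int best = -1;
    for (size_t i = 0; i < t->n_rules; i++) {
        const struct cidr_rule *r = &t->rules[i];
        if ((ip & depth_mask(r->depth)) == r->prefix && (int)r->depth > best) {
            best = r->depth;
            *user_data = r->user_data;
        }
    }
    return best < 0 ? -1 : 0;
}

/* Resolve an IP to its indicator record and bump the hit counter. */
static struct indicator *match_indicator(const struct toy_lpm *t,
                                         struct indicator *table, size_t n,
                                         uint32_t ip)
{
    uint32_t idx;
    if (toy_lpm_lookup(t, ip, &idx) != 0 || idx >= n)
        return NULL;
    table[idx].hits++;
    return &table[idx];
}
```

In this sketch a rule for 10.1.2.0/24 can carry user data 300 and resolve to
table[300]. If user_data were only 8 bits wide, 300 would wrap to 44 and the
lookup would land on the wrong record, which is the limit I am hitting.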