From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mhall@mhcomputing.net>
Received: from mail.mhcomputing.net (master.mhcomputing.net [74.208.46.186])
 by dpdk.org (Postfix) with ESMTP id 4EBC55A1D
 for <dev@dpdk.org>; Mon, 23 Feb 2015 22:17:25 +0100 (CET)
Received: by mail.mhcomputing.net (Postfix, from userid 1000)
 id B4E9380C036; Mon, 23 Feb 2015 13:16:45 -0800 (PST)
Date: Mon, 23 Feb 2015 13:16:45 -0800
From: Matthew Hall <mhall@mhcomputing.net>
To: Matt Laswell <laswell@infiniteio.com>
Message-ID: <20150223211645.GB20766@mhcomputing.net>
References: <3ABAA9DB-3F71-44D4-9C46-22933F9F30F0@mhcomputing.net>
 <20150222160204.20816910@urahara>
 <F543F60F-083D-4018-8387-062EAF8319D1@mhcomputing.net>
 <CA+GnqApB+nEQXD1TssOotXX+sV8DZ5aoDwQnEv9CoUhqwSckFA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CA+GnqApB+nEQXD1TssOotXX+sV8DZ5aoDwQnEv9CoUhqwSckFA@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Cc: "<dev@dpdk.org>" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Appropriate DPDK data structures for TCP sockets
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 23 Feb 2015 21:17:25 -0000

On Mon, Feb 23, 2015 at 08:48:57AM -0600, Matt Laswell wrote:
> Apologies in advance for likely being a bit long-winded.

Long-winded is great; it helps me get context.

> First, you really need to take cache performance into account when you're
> choosing a data structure.  Something like a balanced tree can seem awfully
> appealing at first blush

Agreed. I have done some DPDK work before, but without TCP. This is why I
figured a packet hash would be better than a tree.

> Second, rather than synchronizing (perhaps with locks, perhaps with
> lockless data structures), it's often beneficial to create multiple
> threads, each of which holds a fraction of your connection tracking data.

Yes, I REALLY REALLY REALLY wanted to do RSS. But virtio-net and other
VM NICs don't support RSS, unlike classic PCIe NICs. In order to get the
community to use my app, I have to give them a "batteries included"
environment where the system still works even with no RSS.
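
To make the no-RSS case concrete, here is a minimal sketch of one possible
software fallback: hash the 4-tuple on the receive path and use the result to
pick a worker, so each worker still owns a fraction of the connections. This is
plain C, not a DPDK API; `flow_key`, `flow_hash` and `pick_worker` are
hypothetical names, and the symmetric hash is an assumption of mine rather than
something the thread settled on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-tuple key; field names are illustrative, not DPDK's. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* 32-bit bit mixer (the finalizer step used by MurmurHash3). */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/*
 * Symmetric flow hash: the two endpoints are combined with XOR, so both
 * directions of a connection produce the same value.
 */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t ips = k->src_ip ^ k->dst_ip;
    uint32_t ports = (uint32_t)(k->src_port ^ k->dst_port);

    return mix32(mix32(ips) ^ ports);
}

/* Map a flow to one of n_workers; every packet of a flow gets the same answer. */
static unsigned pick_worker(const struct flow_key *k, unsigned n_workers)
{
    assert(n_workers > 0);
    return flow_hash(k) % n_workers;
}
```

A single receive thread would call `pick_worker()` per packet and hand the
packet to that worker's queue. The cost is one extra hop per packet compared
with hardware RSS.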

> Third, it's very worthwhile to have a cache for the most recently accessed
> connection.  First, because network traffic is bursty, and you'll
> frequently see multiple packets from the same connection in succession.
> Second, because it can make life easier for your application code.  If you
> have multiple places that need to access connection data, you don't have to
> worry so much about the cost of repeated searches.  Again, this may or may
> not matter for your particular application.  But for ones I've worked on,
> it's been a win.

Yes, this sounds like a really good idea. One advantage in my product: I am
only doing TCP syslog, so I don't have an arbitrary zillion connections like
a FW or IPS would. I could cap it at something like 8192 or 16384, which
should be good enough for some time until a better solution is worked out.
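
As a rough sketch of what such a capped table might look like, here is a
fixed-size, open-addressing hash table keyed on the 4-tuple, with the cap set
to 8192. It is plain C rather than a DPDK library; all names (`conn_table`,
`conn_insert`, and so on) are hypothetical, and the slot count, hash function
and tombstone-based delete are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CONNS   8192u             /* cap mentioned above */
#define TABLE_SLOTS (2u * MAX_CONNS)  /* power of two; load factor <= 0.5 */

struct flow_key { uint32_t src_ip, dst_ip; uint16_t src_port, dst_port; };

enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_DEAD };

struct conn_slot {
    struct flow_key key;
    enum slot_state state;
    void *sock;                       /* hypothetical per-connection state */
};

struct conn_table {
    struct conn_slot slots[TABLE_SLOTS];
    unsigned count;                   /* live connections */
};

static void conn_table_init(struct conn_table *t)
{
    assert((TABLE_SLOTS & (TABLE_SLOTS - 1)) == 0);
    memset(t, 0, sizeof(*t));
}

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Simple hash-combine over the key fields; any decent hash would do. */
static uint32_t key_hash(const struct flow_key *k)
{
    uint32_t ports = ((uint32_t)k->src_port << 16) | k->dst_port;
    uint32_t h = k->src_ip * 0x9e3779b1u;

    h ^= k->dst_ip + 0x9e3779b9u + (h << 6) + (h >> 2);
    h ^= ports + 0x9e3779b9u + (h << 6) + (h >> 2);
    return h;
}

/* Linear probing; an EMPTY slot ends the probe sequence. */
static struct conn_slot *conn_lookup(struct conn_table *t,
                                     const struct flow_key *k)
{
    uint32_t i = key_hash(k) & (TABLE_SLOTS - 1);

    for (unsigned n = 0; n < TABLE_SLOTS; n++, i = (i + 1) & (TABLE_SLOTS - 1)) {
        struct conn_slot *s = &t->slots[i];

        if (s->state == SLOT_EMPTY)
            return NULL;
        if (s->state == SLOT_USED && key_equal(&s->key, k))
            return s;
    }
    return NULL;
}

/* Returns the slot (existing or new), or NULL once the cap is reached. */
static struct conn_slot *conn_insert(struct conn_table *t,
                                     const struct flow_key *k, void *sock)
{
    struct conn_slot *free_slot = NULL;
    uint32_t i = key_hash(k) & (TABLE_SLOTS - 1);

    for (unsigned n = 0; n < TABLE_SLOTS; n++, i = (i + 1) & (TABLE_SLOTS - 1)) {
        struct conn_slot *s = &t->slots[i];

        if (s->state == SLOT_USED) {
            if (key_equal(&s->key, k))
                return s;             /* already tracked */
            continue;
        }
        if (free_slot == NULL)
            free_slot = s;            /* first reusable slot on the path */
        if (s->state == SLOT_EMPTY)
            break;                    /* key cannot be further along */
    }
    if (free_slot == NULL || t->count >= MAX_CONNS)
        return NULL;
    free_slot->key = *k;
    free_slot->sock = sock;
    free_slot->state = SLOT_USED;
    t->count++;
    return free_slot;
}

/* Marks the slot DEAD (a tombstone) so later probe sequences stay intact. */
static int conn_delete(struct conn_table *t, const struct flow_key *k)
{
    struct conn_slot *s = conn_lookup(t, k);

    if (s == NULL)
        return 0;
    s->state = SLOT_DEAD;
    s->sock = NULL;
    t->count--;
    return 1;
}
```

The slots live in one contiguous array, so a lookup usually touches one or two
cache lines. Tombstones accumulate under heavy churn, so a long-running process
would want to rebuild the table occasionally; that part is left out here.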

I could make some capped array or linked list of the X most recent ones for
cheap access. It's just socket pointers, so it costs hardly anything to copy
a couple of pointers into a cache and quickly invalidate them when the
connection closes.
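
A minimal sketch of the capped-array variant, assuming a hypothetical
`tcp_sock` struct that carries its own 4-tuple. The cache holds only pointers,
replaces entries round-robin (not true LRU, which is an assumption made to keep
it short), and is cleared by pointer when a connection closes. The size of 4 is
a placeholder for "X".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECENT_SLOTS 4u   /* the "X" above; this value is an assumption */

struct flow_key { uint32_t src_ip, dst_ip; uint16_t src_port, dst_port; };

/* Hypothetical socket object; real per-connection state is elided. */
struct tcp_sock {
    struct flow_key key;
};

struct recent_cache {
    struct tcp_sock *slot[RECENT_SLOTS];
    unsigned next;        /* index of the next slot to overwrite */
};

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Linear scan over a handful of pointers; NULL means "go ask the main table". */
static struct tcp_sock *recent_find(const struct recent_cache *c,
                                    const struct flow_key *k)
{
    for (unsigned i = 0; i < RECENT_SLOTS; i++)
        if (c->slot[i] != NULL && key_equal(&c->slot[i]->key, k))
            return c->slot[i];
    return NULL;
}

/* Remember a socket after a successful main-table lookup. */
static void recent_put(struct recent_cache *c, struct tcp_sock *s)
{
    assert(s != NULL);
    for (unsigned i = 0; i < RECENT_SLOTS; i++)
        if (c->slot[i] == s)
            return;                       /* already cached */
    c->slot[c->next] = s;                 /* overwrite round-robin */
    c->next = (c->next + 1) % RECENT_SLOTS;
}

/* Call on connection close, before the socket is freed. */
static void recent_invalidate(struct recent_cache *c, const struct tcp_sock *s)
{
    for (unsigned i = 0; i < RECENT_SLOTS; i++)
        if (c->slot[i] == s)
            c->slot[i] = NULL;
}
```

The packet path would try `recent_find()` first and fall back to the full
connection table on a miss, calling `recent_put()` with whatever it finds.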

> Anyway, as predicted, this post has gone far too long for a Monday
> morning.  Regardless, I hope you found it useful.

This was great. Thank you!

Matthew.