From: Matthew Hall
To: Morten B
Cc: dev@dpdk.org
Date: Wed, 16 Dec 2015 18:38:24 -0500
Subject: Re: [dpdk-dev] tcpdump support in DPDK 2.3
Message-ID: <20151216233824.GA23052@mhcomputing.net>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC358AF776@smartserver.smartshare.dk>
References: <20151214182931.GA17279@mhcomputing.net>
 <20151214223613.GC21163@mhcomputing.net>
 <20151216104502.GA10020@bricha3-MOBL3>
 <98CBD80474FA8B44BF855DF32C47DC358AF76F@smartserver.smartshare.dk>
 <20151216115611.GB10020@bricha3-MOBL3>
 <98CBD80474FA8B44BF855DF32C47DC358AF771@smartserver.smartshare.dk>
 <20151216131249.GC10020@bricha3-MOBL3>
 <98CBD80474FA8B44BF855DF32C47DC358AF776@smartserver.smartshare.dk>

On Wed, Dec 16, 2015 at 11:45:46PM +0100, Morten Brørup wrote:
> Matthew presented a very important point a few hours ago: We don't need
> tcpdump support for debugging the application in a lab; we already have
> plenty of other tools for debugging what we are developing. We need tcpdump
> support for debugging network issues in a production network.

+1

> In my "hardened network appliance" world, a solution designed purely for
> legacy applications (tcpdump, Wireshark etc.) is useless because the network
> technician doesn't have access to these applications on the appliance.

Maybe that's true on one exact system. But I've used a whole ton of systems
including appliances where this was not true. I really do want to find a way
to support them, but according to my recent discussions w/ Alex Nasonov who
made bpfjit, I don't think it is possible without really tearing apart
libpcap. So for now the only good hope is Wireshark's Extcap support.

> While a PC system running a DPDK based application might have plenty of
> spare lcores for filtering, the SmartShare appliances are already using all
> lcores for dedicated purposes, so the runtime filtering has to be done by
> the IO lcores (otherwise we would have to rehash everything and reallocate
> some lcores for mirroring, which I strongly oppose). Our non-DPDK firmware
> has also always been filtering directly in the fast path.

The shared process stuff and weird leftover lcore stuff seems way too complex
for me whether or not there are any spare lcores. To me it seems easier if I
just call some function and hand it mbufs, and it would quickly check them
against a linked list of active filters if filters are present, or do nothing
and return if no filter is active.

> If the filter is so complex that it unexpectedly degrades the normal traffic
> forwarding performance

If bpfjit is used, I think it is very hard to affect the performance much.
Unless you do something incredibly crazy.

> Although it is generally considered bad design if a system's behavior (or
> performance) changes unexpectedly when debugging features are being used,

I think we can keep the behavior change quite small using something like what
I described.

> Other companies might prefer to keep their fast path performance unaffected
> and dedicate/reallocate some lcores for filtering.

It always starts out unaffected... then goes back to accepting a bit of
slowness when people are forced to re-learn how bad it is with no debugging.
I have seen it again and again in many companies.
Hence my proposal for efficient lightweight debugging support from the
beginning.

> 1. BPF filtering (... a DPDK variant of bpfjit),

+1

> 2. scalable packet queueing for the mirrored packets (probably multi
> producer, single or multi consumer)

I hate queueing. Queueing always reduces max possible throughput because
queueing is inefficient. It is better just to put them where they need to go
immediately (run to completion) while the mbufs are already prefetched.

> Then the DPDK application can take care of interfacing to
> the attached application and outputting the mirrored packets to the
> appropriate destination

Too complicated. Pcap and extcap should be working by default.

> A note about packet ordering: Mirrored packets belonging to different flows
> are probably out of order because of RSS, where multiple lcores contribute
> to the mirror output.

Where I worry is weird configurations where a flow can occur in >1 cores. But
I think most users try not to do this.
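
To make the "call some function and hand it mbufs" idea above a bit more
concrete, here is a rough sketch in plain C. It is only an illustration under
assumptions, not an existing DPDK API: struct pkt is a hypothetical stand-in
for struct rte_mbuf, the match callback stands in for whatever a
bpfjit-compiled filter would look like, and all the names (filter_add,
filter_remove, mirror_burst, match_min_len, sink_count) are made up. It also
ignores locking, so a real version would need per-lcore lists or some
RCU-like scheme before filters could be added while traffic is flowing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: packet bytes plus length. */
struct pkt {
    const uint8_t *data;
    uint16_t len;
};

/* Filter predicate; assumed shape of a compiled BPF filter. Nonzero = match. */
typedef int (*filter_fn)(const struct pkt *p, void *arg);
/* Sink called immediately for each match (e.g. a pcap or extcap writer). */
typedef void (*sink_fn)(const struct pkt *p, void *arg);

struct filter {
    struct filter *next;
    filter_fn match;
    void *match_arg;
    sink_fn sink;
    void *sink_arg;
};

/* Head of the active filter list; NULL means no capture is running. */
static struct filter *active_filters;

/* Push a filter onto the front of the list (not thread safe). */
void filter_add(struct filter *f)
{
    assert(f != NULL && f->match != NULL && f->sink != NULL);
    f->next = active_filters;
    active_filters = f;
}

/* Unlink a filter if it is on the list; otherwise do nothing. */
void filter_remove(struct filter *f)
{
    struct filter **pp;

    for (pp = &active_filters; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == f) {
            *pp = f->next;
            f->next = NULL;
            return;
        }
    }
}

/*
 * Called from the IO path with a burst of packets. With no active filter
 * this is a single pointer test. Otherwise each packet is checked against
 * each filter and handed straight to the sink (run to completion, no
 * intermediate queue). Returns the number of sink deliveries.
 */
unsigned mirror_burst(struct pkt *const *pkts, unsigned n)
{
    const struct filter *f;
    unsigned i, hits = 0;

    if (active_filters == NULL)
        return 0;

    for (i = 0; i < n; i++) {
        for (f = active_filters; f != NULL; f = f->next) {
            if (f->match(pkts[i], f->match_arg)) {
                f->sink(pkts[i], f->sink_arg);
                hits++;
            }
        }
    }
    return hits;
}

/* Example predicate: match packets at least *arg bytes long. */
int match_min_len(const struct pkt *p, void *arg)
{
    return p->len >= *(const uint16_t *)arg;
}

/* Example sink: just count the packets it is given. */
void sink_count(const struct pkt *p, void *arg)
{
    (void)p;
    (*(unsigned *)arg)++;
}
```

The IO lcore would call mirror_burst() on each burst right after rx, while
the packets are still warm in cache. The cost when nobody is capturing is one
comparison per burst, which is the "do nothing and return" case.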