From: Matthew Hall
To: Thomas Monjalon
Cc: dev@dpdk.org
Date: Mon, 5 Jan 2015 21:25:37 -0800
Subject: Re: [dpdk-dev] IPv6 Offload Capabilities
Message-ID: <20150106052537.GB17455@mhcomputing.net>
In-Reply-To: <5360787.ystvMoQ9V7@xps13>
References: <5360787.ystvMoQ9V7@xps13>

On Mon, Jan 05, 2015 at 09:36:54AM +0100, Thomas Monjalon wrote:
> Which checksum are you talking about? IPv6 checksum doesn't exist.

The same computation algorithm must be reused to calculate the IPv6
pseudo-header checksum when generating ICMPv6, UDPv6, and other L4 protocols
whose definitions were retroactively modified to include the IPv6
pseudo-header, and which happen to use the same checksum in L4 that IP used
in L3.

> > Then I went looking and DPDK doesn't offer an accelerated user-space routine
> > for it. Which seems like it could work out quite poorly for people trying to
> > use ARM and PPC where the offloads might not be present. I had to steal an
> > unaccelerated one from *BSD just to get things running until I could figure
> > out a better way, which worked right for IPv6 and ICMP datagrams so
> > everything can use 100% the same clean code.
>
> What are you talking about?

Yeah, this is referring to the IP checksum algorithm, "the ones' complement of
the ones' complement sum of some 16-bit words".
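For reference, here is a minimal sketch of that algorithm in portable C, along
the lines of RFC 1071, plus the IPv6 pseudo-header variant. The function names
(cksum_partial, cksum_fold, in_cksum, ipv6_l4_cksum) are made up for
illustration and are not DPDK APIs. It is unaccelerated, and it assumes
ordinary packet-sized buffers (under 64 KB) so the 32-bit accumulator cannot
overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the buffer to a running sum as big-endian 16-bit words.
 * A trailing odd byte is padded with a zero low byte. */
static uint32_t cksum_partial(const void *buf, size_t len, uint32_t sum)
{
    const uint8_t *p = buf;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold the carries back into 16 bits, then take the ones' complement. */
static uint16_t cksum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Plain Internet checksum over one buffer (hypothetical helper name). */
static uint16_t in_cksum(const void *buf, size_t len)
{
    return cksum_fold(cksum_partial(buf, len, 0));
}

/* L4 checksum over the IPv6 pseudo-header followed by the L4 data.
 * Pseudo-header: source address, destination address, 32-bit upper-layer
 * length, three zero bytes, next-header value. The checksum field inside
 * the L4 data must be zero when this is called. */
static uint16_t ipv6_l4_cksum(const uint8_t src[16], const uint8_t dst[16],
                              uint8_t next_hdr, const void *l4,
                              uint32_t l4_len)
{
    uint32_t sum = 0;

    sum = cksum_partial(src, 16, sum);
    sum = cksum_partial(dst, 16, sum);
    sum += l4_len >> 16;      /* upper 16 bits of the length field */
    sum += l4_len & 0xffff;   /* lower 16 bits of the length field */
    sum += next_hdr;          /* three zero bytes + next header */
    sum = cksum_partial(l4, l4_len, sum);
    return cksum_fold(sum);
}
```

The return value is the checksum as a number, so it should be written into the
packet high byte first. For UDP over IPv6, a computed value of 0 is
transmitted as 0xffff.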
I didn't find a speedy version of that checksum for manually hacking together
IPv6-based frames anyplace inside DPDK.

> Can we have the same performance with extended tables?
> Maybe you just want to implement your own tables.

One thing is for sure: people using DPDK are not going to be Intel
acceleration experts. If we were, we wouldn't need to use DPDK. ;)

Therefore any table that comes with DPDK is definitely going to be using
better optimizations than whatever we come up with on our own, not to mention
that reinventing the wheel incompatibly is a bad thing, even though many C
developers like to do so. ;)

I'm a security expert but I'm not an Intel-friendly hash table expert. It
would be totally OK if the table didn't run as fast when bigger stuff was
used, but right now big stuff is just prohibited by a bunch of hard-coded
sizes, and this seems like a bad thing.

> > 2) The checksum operations are kind of a hodgepodge and don't always have a
> > consistent vision to them... some things like the 16-bit-based IP checksum
> > appear to be missing any routine, including any accelerated one when the
> > offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or
> > other weird crap like IPv6 pseudo headers, even contemplating those gives me
> > a headache, but at least my greenfield code for it works now).
>
> Please detail which function is missing for which usage.

rte_hash_crc exists, rte_hash_crc_4byte exists, but there is no
rte_hash_ip_cksum to use when checksum offloading doesn't work for some reason
(in BSD it's called in_cksum).

The jhash and CRC APIs don't look to be consistent / compatible. An expandable
API with some enum of hash algorithms and a standard calling convention for
accelerated / special algorithms (like ones which assume 4-byte input) would
make this more generic.

> > 3) There isn't a real flexible choice of hash functions for the things which
> > use hashes...
> > for example, something which offered bidirectional programming
> > of the Flow Director hash algo by stock / default (as seen in a paper one of
> > the Intel guys posted recently) would be super awesome.
>
> Again, a reference to the paper would help.

http://www.ndsl.kaist.edu/~shinae/papers/TR-symRSS.pdf

Mentioned by jim at netgate.com (Jim Thompson).

To sum up the paper, there is a special way to set up the Flow Director hash
which barely changes how evenly packets are spread compared with the default
setting, and which gets both directions of an L4 flow routed to the same CPU
core.

But the larger architectural point was my proposed goal that all of the
various kinds of hashes (flow hashes, checksums / packet hashes, table lookup
hashes, etc.) could use a consistent pluggable API, so we could easily move
back and forth between them and write clean, consistent code any time a hash
is being used.

Matthew.
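
To make the pluggable API idea concrete, here is a rough sketch of one way it
could look: an enum of algorithms, one shared calling convention, and a
dispatch table. Every name in it (hash_algo, hash_fn, hash_compute and the
HASH_ALGO_* values) is hypothetical and none of it is an existing DPDK
interface. FNV-1a stands in for a table lookup hash purely because it is
short; an accelerated CRC or jhash would slot in the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm selector; not an existing DPDK type. */
enum hash_algo {
    HASH_ALGO_FNV1A,     /* stand-in for a table lookup hash */
    HASH_ALGO_IP_CKSUM,  /* 16-bit Internet checksum as a packet hash */
    HASH_ALGO_MAX
};

/* One calling convention for every algorithm: data, length, initial value. */
typedef uint32_t (*hash_fn)(const void *data, size_t len, uint32_t init_val);

/* 32-bit FNV-1a; init_val is XORed into the standard offset basis. */
static uint32_t hash_fnv1a(const void *data, size_t len, uint32_t init_val)
{
    const uint8_t *p = data;
    uint32_t h = 2166136261u ^ init_val;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Internet checksum; init_val is a partial sum to continue from.
 * Assumes packet-sized input so the 32-bit accumulator cannot overflow. */
static uint32_t hash_ip_cksum(const void *data, size_t len, uint32_t init_val)
{
    const uint8_t *p = data;
    uint32_t sum = init_val;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;   /* odd trailing byte, zero padded */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Dispatch table indexed by the enum. */
static const hash_fn hash_fns[HASH_ALGO_MAX] = {
    [HASH_ALGO_FNV1A]    = hash_fnv1a,
    [HASH_ALGO_IP_CKSUM] = hash_ip_cksum,
};

/* Single entry point; returns 0 for an unknown algorithm. */
static uint32_t hash_compute(enum hash_algo algo, const void *data,
                             size_t len, uint32_t init_val)
{
    if ((unsigned)algo >= HASH_ALGO_MAX)
        return 0;
    return hash_fns[algo](data, len, init_val);
}
```

With something like this, switching a table from one hash to another is a
one-word change at the call site, e.g. hash_compute(HASH_ALGO_FNV1A, key,
key_len, 0) versus hash_compute(HASH_ALGO_IP_CKSUM, key, key_len, 0).
Special-case variants such as a fixed 4-byte-input routine could be extra enum
entries behind the same signature.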