DPDK patches and discussions
From: Matthew Hall <mhall@mhcomputing.net>
To: Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] IPv6 Offload Capabilities
Date: Mon, 5 Jan 2015 21:25:37 -0800
Message-ID: <20150106052537.GB17455@mhcomputing.net>
In-Reply-To: <5360787.ystvMoQ9V7@xps13>

On Mon, Jan 05, 2015 at 09:36:54AM +0100, Thomas Monjalon wrote:
> Which checksum are you talking about? IPv6 checksum doesn't exist.

The same computation algorithm must be reused to calculate the IPv6
pseudo-header checksum when generating ICMPv6, UDPv6, and other L4 protocols
whose definitions were retroactively modified to include the IPv6
pseudo-header. Those protocols happen to use the same checksum at L4 that
IPv4 used at L3.
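
To make the pseudo-header point concrete, here is a minimal sketch in plain C
of an L4 checksum over the RFC 2460 pseudo-header (source address, destination
address, 32-bit upper-layer length, three zero bytes, next header) followed by
the L4 data. The names csum_add, csum_fold and l4_cksum_ipv6 are made up for
illustration and are not DPDK functions; the code is unaccelerated and assumes
the checksum field inside the L4 header is zero when computing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add `len` bytes to a running ones' complement sum,
 * reading the data as big-endian 16-bit words. An odd trailing byte is
 * padded with zero, so only the final chunk may have an odd length. */
static uint64_t csum_add(uint64_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Hypothetical helper: fold the carries back in and complement. */
static uint16_t csum_fold(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)(~sum & 0xffff);
}

/* Checksum for UDPv6 / ICMPv6 / TCPv6 style protocols. The result is the
 * numeric value of the field; store it high byte first. For UDP, a result
 * of 0 must be transmitted as 0xffff. */
uint16_t l4_cksum_ipv6(const uint8_t src[16], const uint8_t dst[16],
                       uint8_t next_header,
                       const uint8_t *l4, uint32_t l4_len)
{
    /* Last 8 bytes of the pseudo-header: length, 3 zero bytes, next header. */
    const uint8_t tail[8] = {
        (uint8_t)(l4_len >> 24), (uint8_t)(l4_len >> 16),
        (uint8_t)(l4_len >> 8),  (uint8_t)l4_len,
        0, 0, 0, next_header
    };
    uint64_t sum = 0;

    sum = csum_add(sum, src, 16);
    sum = csum_add(sum, dst, 16);
    sum = csum_add(sum, tail, sizeof(tail));
    sum = csum_add(sum, l4, l4_len);
    return csum_fold(sum);
}
```

A receiver can verify a datagram with the same routine: once the checksum
field is filled in, running l4_cksum_ipv6 over the datagram returns 0.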

> > Then I went looking and DPDK doesn't offer an accelerated user-space routine
> > for it. Which seems like it could work out quite poorly for people trying to
> > use ARM and PPC where the offloads might not be present. I had to steal an
> > unaccelerated one from *BSD just to get things running until I could figure
> > out a better way, which worked right for IPv6 and ICMP datagrams so
> > everything can use 100% the same clean code.
> 
> What are you talking about?

Yeah, this is referring to the IP checksum algorithm, "the ones' complement of
the ones' complement sum of some 16-bit words". I didn't find a speedy version
of it anywhere inside DPDK for manually hacking together IPv6-based frames.
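
For reference, the algorithm quoted above can be sketched in a few lines of
portable C, in the style of RFC 1071. This is a plain, unaccelerated version
(the kind of routine BSD calls in_cksum), not something DPDK ships; the name
ip_cksum is an assumption for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement of the ones' complement sum of the data, read as
 * big-endian 16-bit words. An odd trailing byte is padded with zero.
 * The return value is the numeric value of the checksum field; store it
 * high byte first. */
uint16_t ip_cksum(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint64_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)(~sum & 0xffff);
}
```

Run over an IPv4 header with its checksum field zeroed, this yields the value
to store in the field; run over a header with the field already filled in, a
result of 0 means the header is intact.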

> Can we have the same performance with extended tables?
> Maybe you just want to implement your own tables.

One thing is for sure. People using DPDK are not going to be Intel 
acceleration experts. If we were we wouldn't need to use DPDK. ;)

Therefore any table that comes with DPDK is definitely going to be using
better optimizations than whatever we come up with on our own. Not to mention
that reinventing the wheel incompatibly is a bad thing, even though many C
developers like to do so. ;)

I'm a security expert but I'm not an Intel-friendly hash table expert. It
would be totally OK if the table didn't run as fast when bigger stuff was
used, but right now big stuff is simply prohibited by a bunch of hard-coded
sizes, and this seems like a bad thing.

> > 2) The checksum operations are kind of a hodgepodge and don't always have a
> > consistent vision to them... some things like the 16-bit-based IP checksum
> > appear to be missing any routine, including any accelerated one when the
> > offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or
> > other weird crap like IPv6 pseudo headers, even contemplating those gives me
> > a headache, but at least my greenfield code for it works now).
> 
> Please detail which function is missing for which usage.

rte_hash_crc exists and rte_hash_crc_4byte exists, but there is no
rte_hash_ip_cksum to use when checksum offloading doesn't work for some reason
(in BSD it's called in_cksum). The jhash and CRC APIs don't look to be
consistent / compatible. An expandable API with some enum of hash algorithms
and a standard calling convention for accelerated / special algorithms (like
ones which assume 4-byte input) would make this more generic.
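
As a rough sketch of what such an API could look like: one enum of
algorithms, one function-pointer signature, and a dispatch table. Every name
here (xhash, xhash_algo, xhash_fn and so on) is hypothetical and none of it
is an existing DPDK interface. FNV-1a merely stands in for a table-lookup
hash, and the way the init value is mixed in is an arbitrary choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm selector; not a DPDK type. */
enum xhash_algo {
    XHASH_FNV1A,     /* stand-in for a generic table-lookup hash */
    XHASH_IP_CKSUM,  /* 16-bit Internet checksum, returned in 32 bits */
    XHASH_ALGO_MAX
};

/* One calling convention shared by every algorithm. */
typedef uint32_t (*xhash_fn)(const void *data, size_t len, uint32_t init);

static uint32_t xhash_fnv1a(const void *data, size_t len, uint32_t init)
{
    const uint8_t *p = data;
    uint32_t h = 2166136261u ^ init;   /* FNV offset basis, seeded */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

static uint32_t xhash_ip_cksum(const void *data, size_t len, uint32_t init)
{
    const uint8_t *p = data;
    uint64_t sum = init;               /* init acts as a partial sum */

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint32_t)(~sum & 0xffff);
}

/* Dispatch table indexed by the enum; new algorithms plug in here. */
static const xhash_fn xhash_table[XHASH_ALGO_MAX] = {
    [XHASH_FNV1A]    = xhash_fnv1a,
    [XHASH_IP_CKSUM] = xhash_ip_cksum,
};

/* Single entry point; returns 0 for an unknown algorithm. */
uint32_t xhash(enum xhash_algo algo, const void *data, size_t len,
               uint32_t init)
{
    if ((unsigned)algo >= XHASH_ALGO_MAX)
        return 0;
    return xhash_table[algo](data, len, init);
}
```

With this shape, an accelerated or fixed-width variant (for example one that
assumes 4-byte input) would just be another enum value and table entry, and
callers would not need to change.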

> > 3) There isn't a real flexible choice of hash functions for the things which
> > use hashes... for example, something which offered bidirectional programming
> > of the Flow Director hash algo by stock / default (as seen in a paper one of
> > the Intel guys posted recently) would be super awesome.
> 
> Again, a reference to the paper would help.

http://www.ndsl.kaist.edu/~shinae/papers/TR-symRSS.pdf

Mentioned by jim at netgate.com (Jim Thompson).

To sum up the paper, there is a special way to set up the Flow Director hash
which barely changes how evenly packets are spread compared to the default
setting, and which gets both directions of an L4 flow routed to the same CPU
core.
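
Here is a small sketch of why a specially chosen key can do that. It assumes
the hash is a Toeplitz hash over the IPv4 4-tuple (as used for RSS) and that
the key is a repeating 16-bit pattern; the 0x6d5a value and all function
names are assumptions for illustration, not taken from DPDK. Because
swapping source and destination moves every input bit by a multiple of 16
bit positions, a key with a 16-bit period gives both directions the same
hash.

```c
#include <stddef.h>
#include <stdint.h>

#define RSS_KEY_LEN 40

/* Toeplitz hash: for every set input bit, XOR in the 32-bit window of the
 * key that starts at that bit's position. Valid for len <= RSS_KEY_LEN - 4. */
uint32_t toeplitz_hash(const uint8_t key[RSS_KEY_LEN],
                       const uint8_t *in, size_t len)
{
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8)  | key[3];
    uint32_t hash = 0;

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (in[i] & (1u << bit))
                hash ^= window;
            /* Slide the window one bit, pulling in the next key bit. */
            window <<= 1;
            if (i + 4 < RSS_KEY_LEN && (key[i + 4] & (1u << bit)))
                window |= 1;
        }
    }
    return hash;
}

/* Fill the key with a repeating 16-bit pattern (assumed value 0x6d5a). */
void sym_rss_key(uint8_t key[RSS_KEY_LEN])
{
    for (size_t i = 0; i < RSS_KEY_LEN; i++)
        key[i] = (i & 1) ? 0x5a : 0x6d;
}

/* Hash an IPv4 4-tuple laid out as src IP, dst IP, src port, dst port,
 * all in network byte order. */
uint32_t flow_hash(const uint8_t key[RSS_KEY_LEN],
                   uint32_t sip, uint32_t dip,
                   uint16_t sport, uint16_t dport)
{
    const uint8_t in[12] = {
        (uint8_t)(sip >> 24), (uint8_t)(sip >> 16),
        (uint8_t)(sip >> 8),  (uint8_t)sip,
        (uint8_t)(dip >> 24), (uint8_t)(dip >> 16),
        (uint8_t)(dip >> 8),  (uint8_t)dip,
        (uint8_t)(sport >> 8), (uint8_t)sport,
        (uint8_t)(dport >> 8), (uint8_t)dport
    };
    return toeplitz_hash(key, in, sizeof(in));
}
```

Calling flow_hash with (A, B, pa, pb) and with (B, A, pb, pa) returns the same
value under this key, so a queue chosen from the hash is the same for both
directions of a connection.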

But the larger architectural point was my proposed goal that all of the 
various kinds of hashes (flow hashes, checksums / packet hashes, table lookup 
hashes, etc.) could use a consistent pluggable API so we could easily move 
back and forth between them and write clean consistent code any time a hash is 
being used.

Matthew.
