DPDK patches and discussions
* [dpdk-dev] IPv6 Offload Capabilities
@ 2015-01-05  7:56 Gal Sagie
  2015-01-05  8:09 ` Matthew Hall
  2015-01-05  8:33 ` Olivier MATZ
  0 siblings, 2 replies; 7+ messages in thread
From: Gal Sagie @ 2015-01-05  7:56 UTC (permalink / raw)
  To: <dev

Hello All,

I noticed that in version 1.8 there are no flags to indicate IPv6 checksum
offloading (only DEV_TX_OFFLOAD_IPV4_CKSUM),
which means TSO offloading is also not supported for IPv6.

Are there any plans/road map to support this?

Thanks
Gal.

* Re: [dpdk-dev] IPv6 Offload Capabilities
  2015-01-05  7:56 [dpdk-dev] IPv6 Offload Capabilities Gal Sagie
@ 2015-01-05  8:09 ` Matthew Hall
  2015-01-05  8:36   ` Thomas Monjalon
  2015-01-05  8:33 ` Olivier MATZ
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Hall @ 2015-01-05  8:09 UTC (permalink / raw)
  To: <dev

On Jan 4, 2015, at 11:56 PM, Gal Sagie <gal.sagie@gmail.com> wrote:
> I noticed that in version 1.8 there are no flags to indicate IPv6 checksum
> offloading (only DEV_TX_OFFLOAD_IPV4_CKSUM),
> which means TSO offloading is also not supported for IPv6.

I need that feature too. Right now I disabled the IP checksum offloading because I was making some greenfield code which does both protocol versions cleanly, so it's not nice or polite to use real strange asymmetric logic in there.

Then I went looking and DPDK doesn't offer an accelerated user-space routine for it. Which seems like it could work out quite poorly for people trying to use ARM and PPC where the offloads might not be present. I had to steal an unaccelerated one from *BSD just to get things running until I could figure out a better way, which worked right for IPv6 and ICMP datagrams so everything can use 100% the same clean code.

I think a bit more thought is needed around the various crypto / checksum / hash features in DPDK in general for the future versions.

1) The hash table and LPM table have real strict limits about what kinds of keys and values can be used. I have much bigger keys than the usual classic packet keys (which I also need to support) and these won't work in the DPDK's tables. It's a real bummer because I could use these for implementing high speed logging and management protocols where I need to access some funky keys and values at a very high perf rate, not just extremely small ones at line-rate perf rate, as they've got now. It'd also be good if they could work on bigger stuff like L4-L7 security indicators (IPs work, domains, URLs, emails, MD5's, SHA256's, etc. don't normally fit in DPDK's extremely locked down tables).

2) The checksum operations are kind of a hodgepodge and don't always have a consistent vision to them... some things like the 16-bit-based IP checksum appear to be missing any routine, including any accelerated one when the offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or other weird crap like IPv6 pseudo headers, even contemplating those gives me a headache, but at least my greenfield code for it works now).

3) There isn't a real flexible choice of hash functions for the things which use hashes... for example, something which offered bidirectional programming of the Flow Director hash algo by stock / default (as seen in a paper one of the Intel guys posted recently) would be super awesome.

Matthew.

* Re: [dpdk-dev] IPv6 Offload Capabilities
  2015-01-05  7:56 [dpdk-dev] IPv6 Offload Capabilities Gal Sagie
  2015-01-05  8:09 ` Matthew Hall
@ 2015-01-05  8:33 ` Olivier MATZ
  1 sibling, 0 replies; 7+ messages in thread
From: Olivier MATZ @ 2015-01-05  8:33 UTC (permalink / raw)
  To: Gal Sagie, <dev

Hello,

On 01/05/2015 08:56 AM, Gal Sagie wrote:
> I noticed that in version 1.8 there are no flags to indicate IPv6 checksum
> offloading (only DEV_TX_OFFLOAD_IPV4_CKSUM)

There is no L3 checksum field in the IPv6 header, which is why there is no
DEV_TX_OFFLOAD_IPV6_CKSUM flag.

> which means TSO offloading is also not supported for IPv6.

TSO is supported for IPv6. Please see the test report sent on the
mailing list:
http://dpdk.org/ml/archives/dev/2014-November/007991.html

Regards,
Olivier

* Re: [dpdk-dev] IPv6 Offload Capabilities
  2015-01-05  8:09 ` Matthew Hall
@ 2015-01-05  8:36   ` Thomas Monjalon
  2015-01-06  5:25     ` Matthew Hall
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2015-01-05  8:36 UTC (permalink / raw)
  To: Matthew Hall, Gal Sagie; +Cc: dev

Hi Gal and Matthew,

2015-01-05 00:09, Matthew Hall:
> On Jan 4, 2015, at 11:56 PM, Gal Sagie <gal.sagie@gmail.com> wrote:
> > I noticed that in version 1.8 there are no flags to indicate IPv6 checksum
> > offloading (only DEV_TX_OFFLOAD_IPV4_CKSUM),
> > which means TSO offloading is also not supported for IPv6.
> 
> I need that feature too. Right now I disabled the IP checksum offloading
> because I was making some greenfield code which does both protocol versions
> cleanly, so it's not nice or polite to use real strange asymmetric logic in
> there.

Which checksum are you talking about? There is no IPv6 checksum.

> Then I went looking and DPDK doesn't offer an accelerated user-space routine
> for it. Which seems like it could work out quite poorly for people trying to
> use ARM and PPC where the offloads might not be present. I had to steal an
> unaccelerated one from *BSD just to get things running until I could figure
> out a better way, which worked right for IPv6 and ICMP datagrams so
> everything can use 100% the same clean code.

What are you talking about?

> I think a bit more thought is needed around the various crypto / checksum /
> hash features in DPDK in general for the future versions.
> 
> 1) The hash table and LPM table have real strict limits about what kinds of
> keys and values can be used. I have much bigger keys than the usual classic
> packet keys (which I also need to support) and these won't work in the
> DPDK's tables. It's a real bummer because I could use these for implementing
> high speed logging and management protocols where I need to access some
> funky keys and values at a very high perf rate, not just extremely small
> ones at line-rate perf rate, as they've got now. It'd also be good if they
> could work on bigger stuff like L4-L7 security indicators (IPs work,
> domains, URLs, emails, MD5's, SHA256's, etc. don't normally fit in DPDK's
> extremely locked down tables).

Can we have the same performance with extended tables?
Maybe you just want to implement your own tables.

> 2) The checksum operations are kind of a hodgepodge and don't always have a
> consistent vision to them... some things like the 16-bit-based IP checksum
> appear to be missing any routine, including any accelerated one when the
> offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or
> other weird crap like IPv6 pseudo headers, even contemplating those gives me
> a headache, but at least my greenfield code for it works now).

Please detail which function is missing for which usage.

> 3) There isn't a real flexible choice of hash functions for the things which
> use hashes... for example, something which offered bidirectional programming
> of the Flow Director hash algo by stock / default (as seen in a paper one of
> the Intel guys posted recently) would be super awesome.

Again, a reference to the paper would help.

-- 
Thomas

* Re: [dpdk-dev] IPv6 Offload Capabilities
  2015-01-05  8:36   ` Thomas Monjalon
@ 2015-01-06  5:25     ` Matthew Hall
  2015-01-06  5:30       ` Matthew Hall
  2015-01-14 11:29       ` Thomas Monjalon
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Hall @ 2015-01-06  5:25 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jan 05, 2015 at 09:36:54AM +0100, Thomas Monjalon wrote:
> Which checksum are you talking about? There is no IPv6 checksum.

The same algorithm must be reused to calculate the checksum that covers the
IPv6 pseudo-header when generating ICMPv6, UDP over IPv6, and other L4
protocols whose definitions were retroactively modified to include the IPv6
pseudo-header. These use the same checksum at L4 that IPv4 uses at L3.

> > Then I went looking and DPDK doesn't offer an accelerated user-space routine
> > for it. Which seems like it could work out quite poorly for people trying to
> > use ARM and PPC where the offloads might not be present. I had to steal an
> > unaccelerated one from *BSD just to get things running until I could figure
> > out a better way, which worked right for IPv6 and ICMP datagrams so
> > everything can use 100% the same clean code.
> 
> What are you talking about?

Yes, this refers to the IP checksum algorithm, "the ones' complement of
the ones' complement sum of some 16-bit words". I didn't find a fast version
of it anywhere inside DPDK for building IPv6 frames by hand.
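
A minimal portable sketch of the algorithm described above, in the style of
RFC 1071. It is not an accelerated routine and not a DPDK API; the name
sketch_in_cksum is hypothetical and only illustrates what an in_cksum-like
fallback would compute.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of DPDK: returns the ones' complement of
 * the ones' complement sum of the 16-bit big-endian words in buf. An odd
 * trailing byte is padded with a zero byte. The 32-bit accumulator cannot
 * overflow for inputs of up to 64 KiB.
 */
static uint16_t sketch_in_cksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The result is a host-order value whose big-endian encoding goes on the wire.
For the 20-byte IPv4 header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01
c0 a8 00 c7 (checksum field zeroed) it returns 0xb861, and running it again
over the header with that value stored returns 0.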

> Can we have the same performance with extended tables?
> Maybe you just want to implement your own tables.

One thing is for sure. People using DPDK are not going to be Intel 
acceleration experts. If we were we wouldn't need to use DPDK. ;)

Therefore any table that comes with DPDK is definitely going to be using 
better optimizations than whatever we come up with on our own, not to mention 
reinventing the wheel incompatibly is a bad thing, despite that many C 
developers like to do so. ;)

I'm a security expert but I'm not an Intel-friendly hash table expert. It 
would be totally OK if the table didn't run as fast when bigger stuff was 
used, but right now big stuff is just prohibited with a bunch of hard-coded 
sizes and this seems like a bad thing.

> > 2) The checksum operations are kind of a hodgepodge and don't always have a
> > consistent vision to them... some things like the 16-bit-based IP checksum
> > appear to be missing any routine, including any accelerated one when the
> > offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or
> > other weird crap like IPv6 pseudo headers, even contemplating those gives me
> > a headache, but at least my greenfield code for it works now).
> 
> Please detail which function is missing for which usage.

rte_hash_crc exists and rte_hash_crc_4byte exists, but there is no
rte_hash_ip_cksum to use when checksum offloading doesn't work for some reason
(in BSD it's called in_cksum). The jhash and CRC APIs don't look consistent or
compatible with each other. An expandable API with an enum of hash algorithms
and a standard calling convention for accelerated / special algorithms (like
ones which assume 4-byte input) would make this more generic.

> > 3) There isn't a real flexible choice of hash functions for the things which
> > use hashes... for example, something which offered bidirectional programming
> > of the Flow Director hash algo by stock / default (as seen in a paper one of
> > the Intel guys posted recently) would be super awesome.
> 
> Again, a reference to the paper would help.

http://www.ndsl.kaist.edu/~shinae/papers/TR-symRSS.pdf

Mentioned by Jim Thompson (jim at netgate.com).

To sum up the paper: there is a special way to set up the Flow Director hash
that barely changes how evenly packets are distributed compared with the
default setting, and that sends both directions of an L4 flow to the same CPU
core.
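
A software model of that idea, as a sketch only. It assumes the hash in
question is the Toeplitz hash used for RSS and that the key is a repeating
16-bit pattern (0x6d5a here, which is my recollection of the paper's key and
should be checked against it). All names are hypothetical, and this models
the hash in software rather than programming any NIC.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KEY_LEN 40

/*
 * Toeplitz hash: for every set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at that bit
 * position. key_len must be at least 4.
 */
static uint32_t sketch_toeplitz(const uint8_t *key, size_t key_len,
                                const uint8_t *data, size_t data_len)
{
    uint32_t hash = 0;
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | key[3];

    for (size_t i = 0; i < data_len; i++) {
        for (int b = 0; b < 8; b++) {
            size_t next = i * 8 + (size_t)b + 32; /* key bit entering the window */

            if (data[i] & (0x80u >> b))
                hash ^= window;
            window <<= 1;
            if (next / 8 < key_len && (key[next / 8] & (0x80u >> (next % 8))))
                window |= 1;
        }
    }
    return hash;
}

/*
 * Hash an IPv4 4-tuple laid out as src IP, dst IP, src port, dst port,
 * using a key whose bits repeat every 16 bits (assumed pattern 0x6d5a).
 */
static uint32_t sketch_sym_rss(uint32_t sip, uint32_t dip,
                               uint16_t sport, uint16_t dport)
{
    uint8_t key[SKETCH_KEY_LEN];
    uint8_t in[12] = {
        (uint8_t)(sip >> 24), (uint8_t)(sip >> 16), (uint8_t)(sip >> 8), (uint8_t)sip,
        (uint8_t)(dip >> 24), (uint8_t)(dip >> 16), (uint8_t)(dip >> 8), (uint8_t)dip,
        (uint8_t)(sport >> 8), (uint8_t)sport,
        (uint8_t)(dport >> 8), (uint8_t)dport,
    };

    for (size_t i = 0; i < sizeof(key); i++)
        key[i] = (i & 1) ? 0x5a : 0x6d;

    return sketch_toeplitz(key, sizeof(key), in, sizeof(in));
}
```

Because the key repeats every 16 bits, the key window seen by an input bit
depends only on its position modulo 16. Swapping the two addresses moves
bits by 32 positions and swapping the two ports moves them by 16, so both
directions of a flow hash to the same value.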

But the larger architectural point was my proposed goal that all of the 
various kinds of hashes (flow hashes, checksums / packet hashes, table lookup 
hashes, etc.) could use a consistent pluggable API so we could easily move 
back and forth between them and write clean consistent code any time a hash is 
being used.
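
One possible shape for such a pluggable API, as a sketch only. Every name
below is hypothetical and nothing here exists in DPDK. FNV-1a is used purely
as a stand-in for a table-lookup hash; it is not mentioned elsewhere in this
thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm selector. */
enum sketch_hash_algo {
    SKETCH_HASH_IP_CKSUM, /* 16-bit Internet checksum */
    SKETCH_HASH_FNV1A,    /* 32-bit FNV-1a, stand-in for a table hash */
    SKETCH_HASH_ALGO_MAX
};

/* One calling convention for every algorithm: data, length, initial value. */
typedef uint32_t (*sketch_hash_fn)(const void *data, size_t len, uint32_t init);

static uint32_t sketch_ip_cksum(const void *data, size_t len, uint32_t init)
{
    const uint8_t *p = data;
    uint32_t sum = init;

    for (; len > 1; p += 2, len -= 2)
        sum += ((uint32_t)p[0] << 8) | p[1];
    if (len == 1)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

static uint32_t sketch_fnv1a(const void *data, size_t len, uint32_t init)
{
    const uint8_t *p = data;
    uint32_t h = init; /* pass 2166136261u for standard FNV-1a */

    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

static const sketch_hash_fn sketch_hash_table[SKETCH_HASH_ALGO_MAX] = {
    [SKETCH_HASH_IP_CKSUM] = sketch_ip_cksum,
    [SKETCH_HASH_FNV1A]    = sketch_fnv1a,
};

/* Single entry point: callers pick the algorithm by enum value. */
static uint32_t sketch_hash(enum sketch_hash_algo algo, const void *data,
                            size_t len, uint32_t init)
{
    return sketch_hash_table[algo](data, len, init);
}
```

The indirect call has a cost on a fast path, so a real design would probably
also want inline per-algorithm variants next to a generic dispatcher like
this one.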

Matthew.

* Re: [dpdk-dev] IPv6 Offload Capabilities
  2015-01-06  5:25     ` Matthew Hall
@ 2015-01-06  5:30       ` Matthew Hall
  2015-01-14 11:29       ` Thomas Monjalon
  1 sibling, 0 replies; 7+ messages in thread
From: Matthew Hall @ 2015-01-06  5:30 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jan 05, 2015 at 09:25:37PM -0800, Matthew Hall wrote:
> The same algorithm must be reused to calculate the checksum that covers the
> IPv6 pseudo-header when generating ICMPv6, UDP over IPv6, and other L4
> protocols whose definitions were retroactively modified to include the IPv6
> pseudo-header. These use the same checksum at L4 that IPv4 uses at L3.

To clarify, this is the part of the RFC which mentions it:

https://tools.ietf.org/html/rfc2460#section-8.1
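
A sketch of the checksum that section describes: the pseudo-header is the
source address, the destination address, the 32-bit upper-layer packet
length, three zero bytes and the next-header value, and it is summed together
with the L4 header and payload. The function names are hypothetical and this
is a plain portable version, not a DPDK routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf to a running sum; a trailing
 * odd byte is padded with zero. */
static uint32_t sketch_cksum_add(uint32_t sum, const uint8_t *buf, size_t len)
{
    for (; len > 1; buf += 2, len -= 2)
        sum += ((uint32_t)buf[0] << 8) | buf[1];
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;
    return sum;
}

/*
 * Hypothetical helper: upper-layer checksum over the RFC 2460 section 8.1
 * pseudo-header followed by the L4 header and payload. The checksum field
 * inside l4 must be zero when generating, and a result of 0 over a received
 * packet (checksum field left in place) means the packet verifies.
 */
static uint16_t sketch_ipv6_l4_cksum(const uint8_t src[16], const uint8_t dst[16],
                                     uint8_t next_hdr,
                                     const uint8_t *l4, uint32_t l4_len)
{
    /* Upper-layer length (32 bits), 3 zero bytes, next header. */
    const uint8_t tail[8] = {
        (uint8_t)(l4_len >> 24), (uint8_t)(l4_len >> 16),
        (uint8_t)(l4_len >> 8),  (uint8_t)l4_len,
        0, 0, 0, next_hdr,
    };
    uint32_t sum = 0;

    sum = sketch_cksum_add(sum, src, 16);
    sum = sketch_cksum_add(sum, dst, 16);
    sum = sketch_cksum_add(sum, tail, sizeof(tail));
    sum = sketch_cksum_add(sum, l4, l4_len);

    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

The same function covers UDP (next header 17), TCP (6) and ICMPv6 (58). For
UDP, the same RFC section says a computed value of zero is transmitted as
0xffff.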

Also, somebody else mentioned using TSO (TCP Segmentation Offload).

I did look at it, but if I read everything right it only works for TCP. That
would leave me with inconsistent code between the IPv4 and IPv6 stacks, and
with TCP behaving differently from ICMP and UDP.

I was trying to avoid writing too much of this messy code if possible.

Matthew.

* Re: [dpdk-dev] IPv6 Offload Capabilities
  2015-01-06  5:25     ` Matthew Hall
  2015-01-06  5:30       ` Matthew Hall
@ 2015-01-14 11:29       ` Thomas Monjalon
  1 sibling, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2015-01-14 11:29 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

Hi Matthew,

2015-01-05 21:25, Matthew Hall:
> > > 2) The checksum operations are kind of a hodgepodge and don't always have a
> > > consistent vision to them... some things like the 16-bit-based IP checksum
> > > appear to be missing any routine, including any accelerated one when the
> > > offload doesn't work (like for ICMPv4, ICMPv6, and any IPv6 datagrams, or
> > > other weird crap like IPv6 pseudo headers, even contemplating those gives me
> > > a headache, but at least my greenfield code for it works now).
> > 
> > Please detail which function is missing for which usage.
> 
> rte_hash_crc exists and rte_hash_crc_4byte exists, but there is no
> rte_hash_ip_cksum to use when checksum offloading doesn't work for some reason
> (in BSD it's called in_cksum). The jhash and CRC APIs don't look consistent or
> compatible with each other. An expandable API with an enum of hash algorithms
> and a standard calling convention for accelerated / special algorithms (like
> ones which assume 4-byte input) would make this more generic.

[...]

> But the larger architectural point was my proposed goal that all of the 
> various kinds of hashes (flow hashes, checksums / packet hashes, table lookup 
> hashes, etc.) could use a consistent pluggable API so we could easily move 
> back and forth between them and write clean consistent code any time a hash is 
> being used.

Thank you for your detailed comments.
Are you saying that you want to work on such a hash API for DPDK?

-- 
Thomas
