From: "De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>,
Matthew Hall <mhall@mhcomputing.net>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Defaults for rte_hash
Date: Tue, 9 Sep 2014 11:42:40 +0000
Message-ID: <E115CCD9D858EF4F90C690B0DCB4D89722614BD2@IRSMSX108.ger.corp.intel.com>
In-Reply-To: <59AF69C657FD0841A61C55336867B5B0343EFBBD@IRSMSX103.ger.corp.intel.com>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Richardson, Bruce
> Sent: Tuesday, September 09, 2014 11:45 AM
> To: Matthew Hall; dev@dpdk.org
> Subject: Re: [dpdk-dev] Defaults for rte_hash
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matthew Hall
> > Sent: Tuesday, September 09, 2014 11:32 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] Defaults for rte_hash
> >
> > Hello,
> >
> > I was looking at the code which inits rte_hash objects in examples/l3fwd. It's
> > using approx. 1M to 4M hash 'entries' depending on 32-bit vs 64-bit, but it's
> > setting the 'bucket_entries' to just 4.
> >
> > Normally I'm used to using somewhat deeper hash buckets than that... it seems
> > like having a zillion little tiny hash buckets would cause more TLB pressure
> > and memory overhead... or does 4 get shifted / exponentiated into 2**4 ?
> >
That 4 is not shifted, so it really is 4 entries/bucket. The maximum number of entries per bucket you can use is 16, as a bucket will be as big as a cache line.
However, regardless of the number of entries per bucket, the memory size will remain the same. With 4 entries/bucket and a 16-byte key, all keys stored for a bucket fit in one cache line,
so performance looks to be better in this case (although with a non-optimal hash function it may not be possible to store all keys, as the chances of filling a bucket are higher).
Anyway, for this example, 4 entries/bucket looks like a good number to me.
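To make the arithmetic above concrete, here is a rough sketch. The struct below is a hypothetical stand-in that only carries the three values discussed in this thread (it is not the real struct rte_hash_parameters), and it assumes 64-byte cache lines and that the keys of one bucket are stored contiguously.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u /* assumption: 64-byte cache lines */

/* Hypothetical stand-in holding only the fields discussed in this thread;
 * this is not the real struct rte_hash_parameters. */
struct hash_cfg_sketch {
    uint32_t entries;        /* total number of entries in the table */
    uint32_t bucket_entries; /* entries per bucket, taken literally (not 2**n) */
    uint32_t key_len;        /* key length in bytes */
};

/* Number of buckets the table is split into. */
static uint32_t num_buckets(const struct hash_cfg_sketch *c)
{
    assert(c->bucket_entries != 0);
    return c->entries / c->bucket_entries;
}

/* Bytes of key storage belonging to one bucket. */
static size_t bucket_key_bytes(const struct hash_cfg_sketch *c)
{
    return (size_t)c->bucket_entries * c->key_len;
}

/* Total key storage; the bucket depth cancels out of this product. */
static size_t total_key_bytes(const struct hash_cfg_sketch *c)
{
    return (size_t)num_buckets(c) * bucket_key_bytes(c);
}

/* Cache lines touched when scanning all keys of one bucket. */
static size_t bucket_key_cache_lines(const struct hash_cfg_sketch *c)
{
    return (bucket_key_bytes(c) + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE;
}
```

With 1M entries and 16-byte keys, 4 entries/bucket gives 64 bytes of keys per bucket (one cache line), while 16 entries/bucket gives 256 bytes (four cache lines); the total key storage is the same in both cases.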
> > The documentation in
> > http://dpdk.org/doc/api/structrte__hash__parameters.html
> > and http://dpdk.org/doc/api/rte__hash_8h.html isn't that clear... is there a
> > better place to look for this?
> >
> > In my case I'm looking to create a table of 4M or 8M entries, containing
> > tables of security threat IPs / domains, to be detected in the traffic. So it
> > would be good to have some understanding how not to waste a ton of memory
> > on a table this huge without making it run super slow either.
> >
> > Did anybody have some experience with how to get this right?
>
> It might be worth looking too at the hash table structures in the librte_table
> directory for packet framework. These should give better scalability across
> millions of flows than the existing rte_hash implementation. [We're looking
> here to provide in the future a similar, more scalable, hash table
> implementation with an API like that of rte_hash, but that is still under
> development here at the moment.]
>
> >
> > Another thing... the LPM table uses 16-bit Hop IDs. But I would probably have
> > more than 64K CIDR blocks of badness on the Internet available to me for
> > analysis. How would I cope with this, besides just letting some attackers
> > escape unnoticed? ;)
>
> Actually, I think the next hop field in the lpm implementation is only 8-bits,
> not 16 :-). Each lpm entry is only 16-bits in total.
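As a rough illustration of what an 8-bit next hop inside a 16-bit entry implies, here is a small sketch. The bit positions and the names (lpm_entry_sketch, make_entry, and so on) are assumptions made up for this example, not copied from the rte_lpm headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing of a 16-bit entry: the low 8 bits hold the next hop,
 * the high 8 bits are left for other per-entry metadata. Field positions
 * are an assumption for illustration only. */
typedef uint16_t lpm_entry_sketch;

#define NEXT_HOP_BITS 8u
#define MAX_NEXT_HOPS (1u << NEXT_HOP_BITS) /* 256 distinct next-hop values */

/* Pack a next hop and a metadata byte into one 16-bit entry. */
static lpm_entry_sketch make_entry(unsigned next_hop, uint8_t meta)
{
    assert(next_hop < MAX_NEXT_HOPS); /* must fit in 8 bits */
    return (lpm_entry_sketch)(((uint16_t)meta << NEXT_HOP_BITS) | next_hop);
}

/* Extract the 8-bit next hop. */
static uint8_t entry_next_hop(lpm_entry_sketch e)
{
    return (uint8_t)(e & (MAX_NEXT_HOPS - 1u));
}

/* Extract the metadata byte. */
static uint8_t entry_meta(lpm_entry_sketch e)
{
    return (uint8_t)(e >> NEXT_HOP_BITS);
}
```

Note that the field width bounds the number of distinct next-hop values (256 here), not the number of prefixes that can be stored; many CIDR blocks can map to the same value.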
>
> >
> > Have we got some kind of structure which allows a greater number of CIDRs
> > even if it's not quite as fast?
> >
> > Thanks,
> > Matthew.