DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev]  DPDK Hash library
@ 2015-07-01 23:56 Abdul, Jaffar
  2015-07-02  9:36 ` Bruce Richardson
  0 siblings, 1 reply; 8+ messages in thread
From: Abdul, Jaffar @ 2015-07-01 23:56 UTC (permalink / raw)
  To: dev

Hi,

I am wondering how can I use the hash library if I don't know the number of entries in the bucket (number of entries in the bucket can grow dynamically)
I am trying to use the DPDK hash library for MAC table where I can't give the fixed number of elements in each bucket.

Please let me know if you have suggestions on this topic 

Thanks in advance!

Thanks
Jaffar

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-01 23:56 [dpdk-dev] DPDK Hash library Abdul, Jaffar
@ 2015-07-02  9:36 ` Bruce Richardson
  2015-07-02 11:20   ` Dumitrescu, Cristian
  2015-07-02 17:37   ` Abdul, Jaffar
  0 siblings, 2 replies; 8+ messages in thread
From: Bruce Richardson @ 2015-07-02  9:36 UTC (permalink / raw)
  To: Abdul, Jaffar; +Cc: dev

On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> Hi,
> 
> I am wondering how can I use the hash library if I don't know the number of entries in the bucket (number of entries in the bucket can grow dynamically)
> I am trying to use the DPDK hash library for MAC table where I can't give the fixed number of elements in each bucket.

The current DPDK hash library does not support this, unfortunately. There is
currently an effort underway to do a cuckoo hash implementation for DPDK (patches
already on-list), to allow items to move between buckets rather than just failing
if a bucket is full. Please feel free to try out these patches and provide feedback
on them if you can!

	/Bruce
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-02  9:36 ` Bruce Richardson
@ 2015-07-02 11:20   ` Dumitrescu, Cristian
  2015-07-02 16:30     ` Matthew Hall
  2015-07-02 17:37   ` Abdul, Jaffar
  1 sibling, 1 reply; 8+ messages in thread
From: Dumitrescu, Cristian @ 2015-07-02 11:20 UTC (permalink / raw)
  To: Richardson, Bruce, Abdul, Jaffar; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Thursday, July 2, 2015 10:36 AM
> To: Abdul, Jaffar
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] DPDK Hash library
> 
> On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> > Hi,
> >
> > I am wondering how can I use the hash library if I don't know the number
> of entries in the bucket (number of entries in the bucket can grow
> dynamically)
> > I am trying to use the DPDK hash library for MAC table where I can't give
> the fixed number of elements in each bucket.
> 
> The current DPDK hash library does not support this, unfortunately. There is
> currently an effort underway to do a cuckoo hash implementation for DPDK
> (patches
> already on-list), to allow items to move between buckets rather than just
> failing
> if a bucket is full. Please feel free to try out these patches and provide
> feedback
> on them if you can!
> 
> 	/Bruce
> >

To achieve this, you can also use the extendible bucket hash tables provided by librte_table.

API is in lib/librte_table/rte_table_hash.h:
1. rte_table_hash_ext = Extendible bucket hash table accepting any key size
2. rte_table_hash_key8_ext = Extendible bucket hash table accepting only a key size of 8 bytes (optimized for 8-byte keys)
3. rte_table_hash_key16_ext = Extendible bucket hash table accepting only a key size of 16 bytes (optimized for 16-byte keys)

Extendible bucket means that each bucket starts with table space for only 4 keys; whenever a new key needs to be added to a bucket that already has 4 keys, the bucket gets extended with space for 4 more keys, so basically the bucket size goes up and down as keys are added or removed from a bucket.

The tables in librte_table can be used standalone or in correlation with the pipeline construction provided by librte_pipeline. For more details on these hash table, please look into DPDK Programmer's Guide, chapter Packet Framework, section 24.4.3 Hash Table Design.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-02 11:20   ` Dumitrescu, Cristian
@ 2015-07-02 16:30     ` Matthew Hall
  0 siblings, 0 replies; 8+ messages in thread
From: Matthew Hall @ 2015-07-02 16:30 UTC (permalink / raw)
  To: Abdul, Jaffar, dev

On Jul 2, 2015, at 4:20 AM, Dumitrescu, Cristian <cristian.dumitrescu@intel.com> wrote:
> I am wondering how can I use the hash library if I don't know the number
> of entries in the bucket (number of entries in the bucket can grow
> dynamically)
> I am trying to use the DPDK hash library for MAC table where I can't give
> the fixed number of elements in each bucket.

Another thing to keep in mind. DPDK hashes have an extremely large number of shallow buckets and pretty good spreading hash functions, This is for performance reasons to minimize the amount of cacheline loads. The chances of evicting buckets in this sort of hash table aren't really all that high so worrying about this might be overkill.

If you want a good quality hash table for weird stuff like variable-length keys, deeper buckets to guarantee items are preserved, etc., what I ended up doing was combining uthash with jemalloc in my app (you could also use rte_malloc but I didn't want to complicate the app too much until I get it feature complete and begin tuning it).

https://troydhanson.github.io/uthash/
https://github.com/troydhanson/uthash

http://www.canonware.com/jemalloc/
https://github.com/jemalloc/jemalloc

Matthew.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-02  9:36 ` Bruce Richardson
  2015-07-02 11:20   ` Dumitrescu, Cristian
@ 2015-07-02 17:37   ` Abdul, Jaffar
  2015-07-02 17:55     ` De Lara Guarch, Pablo
  1 sibling, 1 reply; 8+ messages in thread
From: Abdul, Jaffar @ 2015-07-02 17:37 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

HI Bruce,

Thanks for your inputs. I am wondering why Cuckoo hash is pushing the entry into new bucket why not add a new entry in the same bucket ?

Do we see this is better or faster compared to creating one more new entry in the same bucket 

Thanks
Jaffar


-----Original Message-----
From: Bruce Richardson [mailto:bruce.richardson@intel.com] 
Sent: Thursday, July 02, 2015 2:36 AM
To: Abdul, Jaffar
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] DPDK Hash library

On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> Hi,
> 
> I am wondering how can I use the hash library if I don't know the 
> number of entries in the bucket (number of entries in the bucket can grow dynamically) I am trying to use the DPDK hash library for MAC table where I can't give the fixed number of elements in each bucket.

The current DPDK hash library does not support this, unfortunately. There is currently an effort underway to do a cuckoo hash implementation for DPDK (patches already on-list), to allow items to move between buckets rather than just failing if a bucket is full. Please feel free to try out these patches and provide feedback on them if you can!

	/Bruce
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-02 17:37   ` Abdul, Jaffar
@ 2015-07-02 17:55     ` De Lara Guarch, Pablo
  2015-07-02 19:26       ` Matthew Hall
  0 siblings, 1 reply; 8+ messages in thread
From: De Lara Guarch, Pablo @ 2015-07-02 17:55 UTC (permalink / raw)
  To: Abdul, Jaffar, Richardson, Bruce; +Cc: dev

Hi Jaffar,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Abdul, Jaffar
> Sent: Thursday, July 02, 2015 6:38 PM
> To: Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] DPDK Hash library
> 
> HI Bruce,
> 
> Thanks for your inputs. I am wondering why Cuckoo hash is pushing the entry
> into new bucket why not add a new entry in the same bucket ?
> 
> Do we see this is better or faster compared to creating one more new entry
> in the same bucket

You are probably talking about extendable buckets here.
The downsize of that approach is that you have to allocate memory on the fly,
whereas with the cuckoo hash implementation, the entry can be stored in an alternative bucket
without having to reserve more memory (which also will take you more time).
With this approach, hash tables can get a higher utilization, as other less used
buckets can be used to store keys from other busier buckets.

Pablo
> 
> Thanks
> Jaffar
> 
> 
> -----Original Message-----
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Thursday, July 02, 2015 2:36 AM
> To: Abdul, Jaffar
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] DPDK Hash library
> 
> On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> > Hi,
> >
> > I am wondering how can I use the hash library if I don't know the
> > number of entries in the bucket (number of entries in the bucket can grow
> dynamically) I am trying to use the DPDK hash library for MAC table where I
> can't give the fixed number of elements in each bucket.
> 
> The current DPDK hash library does not support this, unfortunately. There is
> currently an effort underway to do a cuckoo hash implementation for DPDK
> (patches already on-list), to allow items to move between buckets rather
> than just failing if a bucket is full. Please feel free to try out these patches
> and provide feedback on them if you can!
> 
> 	/Bruce
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-02 17:55     ` De Lara Guarch, Pablo
@ 2015-07-02 19:26       ` Matthew Hall
  2015-07-02 21:02         ` Abdul, Jaffar
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Hall @ 2015-07-02 19:26 UTC (permalink / raw)
  To: De Lara Guarch, Pablo; +Cc: dev

On Thu, Jul 02, 2015 at 05:55:20PM +0000, De Lara Guarch, Pablo wrote:
> You are probably talking about extendable buckets here.
> The downsize of that approach is that you have to allocate memory on the fly,
> whereas with the cuckoo hash implementation, the entry can be stored in an alternative bucket
> without having to reserve more memory (which also will take you more time).
> With this approach, hash tables can get a higher utilization, as other less used
> buckets can be used to store keys from other busier buckets.
> 
> Pablo

Expanding and shrinking buckets constantly can also be concurrency-hostile, 
and is a lot more complicated to get right than just using a good rehash 
algorithm and a nice static hunk of memory on contiguous hugepages for minimal 
TLB / cache pressure.

If you want to do these more complex manipulations uthash is probably a better 
route. But it will be slower than the DPDK hashes by quite a ways I think. I 
used DPDK hash for my TCP socket table where everything is a very predictable 
size, but I had to use uthash for my unpredictably sized byte buffers for 
security indicators (IP, URL, Domain, Email, File Hash, etc.)

Of course, when you do this kind of stuff in your app it is going to give you 
scaling problems and you'll have to spend a lot of time tuning it.

Matthew.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] DPDK Hash library
  2015-07-02 19:26       ` Matthew Hall
@ 2015-07-02 21:02         ` Abdul, Jaffar
  0 siblings, 0 replies; 8+ messages in thread
From: Abdul, Jaffar @ 2015-07-02 21:02 UTC (permalink / raw)
  To: Matthew Hall, De Lara Guarch, Pablo; +Cc: dev

HI

Thanks guys for helping me on this one! Your suggestions are very good and it is very helpful!

Thanks
Jaffar


-----Original Message-----
From: Matthew Hall [mailto:mhall@mhcomputing.net] 
Sent: Thursday, July 02, 2015 12:27 PM
To: De Lara Guarch, Pablo
Cc: Abdul, Jaffar; Richardson, Bruce; dev@dpdk.org
Subject: Re: [dpdk-dev] DPDK Hash library

On Thu, Jul 02, 2015 at 05:55:20PM +0000, De Lara Guarch, Pablo wrote:
> You are probably talking about extendable buckets here.
> The downsize of that approach is that you have to allocate memory on 
> the fly, whereas with the cuckoo hash implementation, the entry can be 
> stored in an alternative bucket without having to reserve more memory (which also will take you more time).
> With this approach, hash tables can get a higher utilization, as other 
> less used buckets can be used to store keys from other busier buckets.
> 
> Pablo

Expanding and shrinking buckets constantly can also be concurrency-hostile, and is a lot more complicated to get right than just using a good rehash algorithm and a nice static hunk of memory on contiguous hugepages for minimal TLB / cache pressure.

If you want to do these more complex manipulations uthash is probably a better route. But it will be slower than the DPDK hashes by quite a ways I think. I used DPDK hash for my TCP socket table where everything is a very predictable size, but I had to use uthash for my unpredictably sized byte buffers for security indicators (IP, URL, Domain, Email, File Hash, etc.)

Of course, when you do this kind of stuff in your app it is going to give you scaling problems and you'll have to spend a lot of time tuning it.

Matthew.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-07-02 21:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-01 23:56 [dpdk-dev] DPDK Hash library Abdul, Jaffar
2015-07-02  9:36 ` Bruce Richardson
2015-07-02 11:20   ` Dumitrescu, Cristian
2015-07-02 16:30     ` Matthew Hall
2015-07-02 17:37   ` Abdul, Jaffar
2015-07-02 17:55     ` De Lara Guarch, Pablo
2015-07-02 19:26       ` Matthew Hall
2015-07-02 21:02         ` Abdul, Jaffar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).