* [dpdk-dev] DPDK Hash library
@ 2015-07-01 23:56 Abdul, Jaffar
2015-07-02 9:36 ` Bruce Richardson
0 siblings, 1 reply; 8+ messages in thread
From: Abdul, Jaffar @ 2015-07-01 23:56 UTC (permalink / raw)
To: dev
Hi,
I am wondering how can I use the hash library if I don't know the number of entries in the bucket (number of entries in the bucket can grow dynamically)
I am trying to use the DPDK hash library for MAC table where I can't give the fixed number of elements in each bucket.
Please let me know if you have suggestions on this topic
Thanks in advance!
Thanks
Jaffar
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-01 23:56 [dpdk-dev] DPDK Hash library Abdul, Jaffar
@ 2015-07-02 9:36 ` Bruce Richardson
2015-07-02 11:20 ` Dumitrescu, Cristian
2015-07-02 17:37 ` Abdul, Jaffar
0 siblings, 2 replies; 8+ messages in thread
From: Bruce Richardson @ 2015-07-02 9:36 UTC (permalink / raw)
To: Abdul, Jaffar; +Cc: dev
On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> Hi,
>
> I am wondering how can I use the hash library if I don't know the number of entries in the bucket (number of entries in the bucket can grow dynamically)
> I am trying to use the DPDK hash library for MAC table where I can't give the fixed number of elements in each bucket.
The current DPDK hash library does not support this, unfortunately. There is
currently an effort underway to do a cuckoo hash implementation for DPDK (patches
already on-list), to allow items to move between buckets rather than just failing
if a bucket is full. Please feel free to try out these patches and provide feedback
on them if you can!
/Bruce
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-02 9:36 ` Bruce Richardson
@ 2015-07-02 11:20 ` Dumitrescu, Cristian
2015-07-02 16:30 ` Matthew Hall
2015-07-02 17:37 ` Abdul, Jaffar
1 sibling, 1 reply; 8+ messages in thread
From: Dumitrescu, Cristian @ 2015-07-02 11:20 UTC (permalink / raw)
To: Richardson, Bruce, Abdul, Jaffar; +Cc: dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Thursday, July 2, 2015 10:36 AM
> To: Abdul, Jaffar
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] DPDK Hash library
>
> On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> > Hi,
> >
> > I am wondering how can I use the hash library if I don't know the number
> of entries in the bucket (number of entries in the bucket can grow
> dynamically)
> > I am trying to use the DPDK hash library for MAC table where I can't give
> the fixed number of elements in each bucket.
>
> The current DPDK hash library does not support this, unfortunately. There is
> currently an effort underway to do a cuckoo hash implementation for DPDK
> (patches
> already on-list), to allow items to move between buckets rather than just
> failing
> if a bucket is full. Please feel free to try out these patches and provide
> feedback
> on them if you can!
>
> /Bruce
> >
To achieve this, you can also use the extendible bucket hash tables provided by librte_table.
API is in lib/librte_table/rte_table_hash.h:
1. rte_table_hash_ext = Extendible bucket hash table accepting any key size
2. rte_table_hash_key8_ext = Extendible bucket hash table accepting only a key size of 8 bytes (optimized for 8-byte keys)
3. rte_table_hash_key16_ext = Extendible bucket hash table accepting only a key size of 16 bytes (optimized for 16-byte keys)
Extendible bucket means that each bucket starts with table space for only 4 keys; whenever a new key needs to be added to a bucket that already has 4 keys, the bucket gets extended with space for 4 more keys, so basically the bucket size goes up and down as keys are added or removed from a bucket.
The tables in librte_table can be used standalone or in correlation with the pipeline construction provided by librte_pipeline. For more details on these hash table, please look into DPDK Programmer's Guide, chapter Packet Framework, section 24.4.3 Hash Table Design.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-02 11:20 ` Dumitrescu, Cristian
@ 2015-07-02 16:30 ` Matthew Hall
0 siblings, 0 replies; 8+ messages in thread
From: Matthew Hall @ 2015-07-02 16:30 UTC (permalink / raw)
To: Abdul, Jaffar, dev
On Jul 2, 2015, at 4:20 AM, Dumitrescu, Cristian <cristian.dumitrescu@intel.com> wrote:
> I am wondering how can I use the hash library if I don't know the number
> of entries in the bucket (number of entries in the bucket can grow
> dynamically)
> I am trying to use the DPDK hash library for MAC table where I can't give
> the fixed number of elements in each bucket.
Another thing to keep in mind. DPDK hashes have an extremely large number of shallow buckets and pretty good spreading hash functions, This is for performance reasons to minimize the amount of cacheline loads. The chances of evicting buckets in this sort of hash table aren't really all that high so worrying about this might be overkill.
If you want a good quality hash table for weird stuff like variable-length keys, deeper buckets to guarantee items are preserved, etc., what I ended up doing was combining uthash with jemalloc in my app (you could also use rte_malloc but I didn't want to complicate the app too much until I get it feature complete and begin tuning it).
https://troydhanson.github.io/uthash/
https://github.com/troydhanson/uthash
http://www.canonware.com/jemalloc/
https://github.com/jemalloc/jemalloc
Matthew.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-02 9:36 ` Bruce Richardson
2015-07-02 11:20 ` Dumitrescu, Cristian
@ 2015-07-02 17:37 ` Abdul, Jaffar
2015-07-02 17:55 ` De Lara Guarch, Pablo
1 sibling, 1 reply; 8+ messages in thread
From: Abdul, Jaffar @ 2015-07-02 17:37 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev
HI Bruce,
Thanks for your inputs. I am wondering why Cuckoo hash is pushing the entry into new bucket why not add a new entry in the same bucket ?
Do we see this is better or faster compared to creating one more new entry in the same bucket
Thanks
Jaffar
-----Original Message-----
From: Bruce Richardson [mailto:bruce.richardson@intel.com]
Sent: Thursday, July 02, 2015 2:36 AM
To: Abdul, Jaffar
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] DPDK Hash library
On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> Hi,
>
> I am wondering how can I use the hash library if I don't know the
> number of entries in the bucket (number of entries in the bucket can grow dynamically) I am trying to use the DPDK hash library for MAC table where I can't give the fixed number of elements in each bucket.
The current DPDK hash library does not support this, unfortunately. There is currently an effort underway to do a cuckoo hash implementation for DPDK (patches already on-list), to allow items to move between buckets rather than just failing if a bucket is full. Please feel free to try out these patches and provide feedback on them if you can!
/Bruce
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-02 17:37 ` Abdul, Jaffar
@ 2015-07-02 17:55 ` De Lara Guarch, Pablo
2015-07-02 19:26 ` Matthew Hall
0 siblings, 1 reply; 8+ messages in thread
From: De Lara Guarch, Pablo @ 2015-07-02 17:55 UTC (permalink / raw)
To: Abdul, Jaffar, Richardson, Bruce; +Cc: dev
Hi Jaffar,
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Abdul, Jaffar
> Sent: Thursday, July 02, 2015 6:38 PM
> To: Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] DPDK Hash library
>
> HI Bruce,
>
> Thanks for your inputs. I am wondering why Cuckoo hash is pushing the entry
> into new bucket why not add a new entry in the same bucket ?
>
> Do we see this is better or faster compared to creating one more new entry
> in the same bucket
You are probably talking about extendable buckets here.
The downsize of that approach is that you have to allocate memory on the fly,
whereas with the cuckoo hash implementation, the entry can be stored in an alternative bucket
without having to reserve more memory (which also will take you more time).
With this approach, hash tables can get a higher utilization, as other less used
buckets can be used to store keys from other busier buckets.
Pablo
>
> Thanks
> Jaffar
>
>
> -----Original Message-----
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Thursday, July 02, 2015 2:36 AM
> To: Abdul, Jaffar
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] DPDK Hash library
>
> On Wed, Jul 01, 2015 at 07:56:28PM -0400, Abdul, Jaffar wrote:
> > Hi,
> >
> > I am wondering how can I use the hash library if I don't know the
> > number of entries in the bucket (number of entries in the bucket can grow
> dynamically) I am trying to use the DPDK hash library for MAC table where I
> can't give the fixed number of elements in each bucket.
>
> The current DPDK hash library does not support this, unfortunately. There is
> currently an effort underway to do a cuckoo hash implementation for DPDK
> (patches already on-list), to allow items to move between buckets rather
> than just failing if a bucket is full. Please feel free to try out these patches
> and provide feedback on them if you can!
>
> /Bruce
> >
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-02 17:55 ` De Lara Guarch, Pablo
@ 2015-07-02 19:26 ` Matthew Hall
2015-07-02 21:02 ` Abdul, Jaffar
0 siblings, 1 reply; 8+ messages in thread
From: Matthew Hall @ 2015-07-02 19:26 UTC (permalink / raw)
To: De Lara Guarch, Pablo; +Cc: dev
On Thu, Jul 02, 2015 at 05:55:20PM +0000, De Lara Guarch, Pablo wrote:
> You are probably talking about extendable buckets here.
> The downsize of that approach is that you have to allocate memory on the fly,
> whereas with the cuckoo hash implementation, the entry can be stored in an alternative bucket
> without having to reserve more memory (which also will take you more time).
> With this approach, hash tables can get a higher utilization, as other less used
> buckets can be used to store keys from other busier buckets.
>
> Pablo
Expanding and shrinking buckets constantly can also be concurrency-hostile,
and is a lot more complicated to get right than just using a good rehash
algorithm and a nice static hunk of memory on contiguous hugepages for minimal
TLB / cache pressure.
If you want to do these more complex manipulations uthash is probably a better
route. But it will be slower than the DPDK hashes by quite a ways I think. I
used DPDK hash for my TCP socket table where everything is a very predictable
size, but I had to use uthash for my unpredictably sized byte buffers for
security indicators (IP, URL, Domain, Email, File Hash, etc.)
Of course, when you do this kind of stuff in your app it is going to give you
scaling problems and you'll have to spend a lot of time tuning it.
Matthew.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [dpdk-dev] DPDK Hash library
2015-07-02 19:26 ` Matthew Hall
@ 2015-07-02 21:02 ` Abdul, Jaffar
0 siblings, 0 replies; 8+ messages in thread
From: Abdul, Jaffar @ 2015-07-02 21:02 UTC (permalink / raw)
To: Matthew Hall, De Lara Guarch, Pablo; +Cc: dev
HI
Thanks guys for helping me on this one! Your suggestions are very good and it is very helpful!
Thanks
Jaffar
-----Original Message-----
From: Matthew Hall [mailto:mhall@mhcomputing.net]
Sent: Thursday, July 02, 2015 12:27 PM
To: De Lara Guarch, Pablo
Cc: Abdul, Jaffar; Richardson, Bruce; dev@dpdk.org
Subject: Re: [dpdk-dev] DPDK Hash library
On Thu, Jul 02, 2015 at 05:55:20PM +0000, De Lara Guarch, Pablo wrote:
> You are probably talking about extendable buckets here.
> The downsize of that approach is that you have to allocate memory on
> the fly, whereas with the cuckoo hash implementation, the entry can be
> stored in an alternative bucket without having to reserve more memory (which also will take you more time).
> With this approach, hash tables can get a higher utilization, as other
> less used buckets can be used to store keys from other busier buckets.
>
> Pablo
Expanding and shrinking buckets constantly can also be concurrency-hostile, and is a lot more complicated to get right than just using a good rehash algorithm and a nice static hunk of memory on contiguous hugepages for minimal TLB / cache pressure.
If you want to do these more complex manipulations uthash is probably a better route. But it will be slower than the DPDK hashes by quite a ways I think. I used DPDK hash for my TCP socket table where everything is a very predictable size, but I had to use uthash for my unpredictably sized byte buffers for security indicators (IP, URL, Domain, Email, File Hash, etc.)
Of course, when you do this kind of stuff in your app it is going to give you scaling problems and you'll have to spend a lot of time tuning it.
Matthew.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-07-02 21:02 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-01 23:56 [dpdk-dev] DPDK Hash library Abdul, Jaffar
2015-07-02 9:36 ` Bruce Richardson
2015-07-02 11:20 ` Dumitrescu, Cristian
2015-07-02 16:30 ` Matthew Hall
2015-07-02 17:37 ` Abdul, Jaffar
2015-07-02 17:55 ` De Lara Guarch, Pablo
2015-07-02 19:26 ` Matthew Hall
2015-07-02 21:02 ` Abdul, Jaffar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).