DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] cuckoo hash in dpdk
@ 2017-08-23 14:15 Pragash Vijayaragavan
  2017-08-23 18:28 ` Dumitrescu, Cristian
  2017-08-24 14:53 ` Andriy Berestovskyy
  0 siblings, 2 replies; 7+ messages in thread
From: Pragash Vijayaragavan @ 2017-08-23 14:15 UTC (permalink / raw)
  To: dev; +Cc: Minseok Kwon

Hi,

I got the chance to look at the cuckoo hash used in dpdk and have a query.

would using division and modulo operations be slower than bitwise
operations on RTE_HASH_BUCKET_ENTRIES, specially since
RTE_HASH_BUCKET_ENTRIES is a power of 2.
For example, to do a modulo we can do a "AND" operation on
(RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
filter for VPP and doing this gave a slight improvement in speed.
Is there any particular reason its done this way.

Sorry if i am being wrong in any way, i was just curious.

Thanks,

Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3620@rit.edu
ph : 585 764 4662

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] cuckoo hash in dpdk
  2017-08-23 14:15 [dpdk-dev] cuckoo hash in dpdk Pragash Vijayaragavan
@ 2017-08-23 18:28 ` Dumitrescu, Cristian
  2017-08-23 20:20   ` Pragash Vijayaragavan
  2017-08-24 14:53 ` Andriy Berestovskyy
  1 sibling, 1 reply; 7+ messages in thread
From: Dumitrescu, Cristian @ 2017-08-23 18:28 UTC (permalink / raw)
  To: pxv3620, dev; +Cc: Minseok Kwon



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Pragash
> Vijayaragavan
> Sent: Wednesday, August 23, 2017 3:16 PM
> To: dev@dpdk.org
> Cc: Minseok Kwon <mxkvcs@rit.edu>
> Subject: [dpdk-dev] cuckoo hash in dpdk
> 
> Hi,
> 
> I got the chance to look at the cuckoo hash used in dpdk and have a query.
> 
> would using division and modulo operations be slower than bitwise
> operations on RTE_HASH_BUCKET_ENTRIES, specially since
> RTE_HASH_BUCKET_ENTRIES is a power of 2.
> For example, to do a modulo we can do a "AND" operation on
> (RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
> filter for VPP and doing this gave a slight improvement in speed.
> Is there any particular reason its done this way.
> 
> Sorry if i am being wrong in any way, i was just curious.
> 
> Thanks,
> 
> Pragash Vijayaragavan
> Grad Student at Rochester Institute of Technology
> email : pxv3620@rit.edu
> ph : 585 764 4662

Bitwise AND typically takes 1 cycle on any CPU, while modulo takes dozens of cycles.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] cuckoo hash in dpdk
  2017-08-23 18:28 ` Dumitrescu, Cristian
@ 2017-08-23 20:20   ` Pragash Vijayaragavan
  2017-08-23 20:39     ` Pragash Vijayaragavan
  0 siblings, 1 reply; 7+ messages in thread
From: Pragash Vijayaragavan @ 2017-08-23 20:20 UTC (permalink / raw)
  To: Dumitrescu, Cristian; +Cc: dev, Minseok Kwon

Hi ,

The performance will depend on the time taken for calculating the hash1 and
hash2 values for each lookup.

Can i know which hash functions are used to calculate the hash values for
each incoming key. We use CRC32 which uses xxhash i guess.
I could not find the implementation of the hash functions.


Thanks,

Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3620@rit.edu
ph : 585 764 4662


On Wed, Aug 23, 2017 at 2:28 PM, Dumitrescu, Cristian <
cristian.dumitrescu@intel.com> wrote:

>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Pragash
> > Vijayaragavan
> > Sent: Wednesday, August 23, 2017 3:16 PM
> > To: dev@dpdk.org
> > Cc: Minseok Kwon <mxkvcs@rit.edu>
> > Subject: [dpdk-dev] cuckoo hash in dpdk
> >
> > Hi,
> >
> > I got the chance to look at the cuckoo hash used in dpdk and have a
> query.
> >
> > would using division and modulo operations be slower than bitwise
> > operations on RTE_HASH_BUCKET_ENTRIES, specially since
> > RTE_HASH_BUCKET_ENTRIES is a power of 2.
> > For example, to do a modulo we can do a "AND" operation on
> > (RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
> > filter for VPP and doing this gave a slight improvement in speed.
> > Is there any particular reason its done this way.
> >
> > Sorry if i am being wrong in any way, i was just curious.
> >
> > Thanks,
> >
> > Pragash Vijayaragavan
> > Grad Student at Rochester Institute of Technology
> > email : pxv3620@rit.edu
> > ph : 585 764 4662
>
> Bitwise AND typically takes 1 cycle on any CPU, while modulo takes dozens
> of cycles.
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] cuckoo hash in dpdk
  2017-08-23 20:20   ` Pragash Vijayaragavan
@ 2017-08-23 20:39     ` Pragash Vijayaragavan
  0 siblings, 0 replies; 7+ messages in thread
From: Pragash Vijayaragavan @ 2017-08-23 20:39 UTC (permalink / raw)
  To: Dumitrescu, Cristian; +Cc: dev, Minseok Kwon

And yea, mod will take only 1 Clock cycle that way :)

Also if divisor is a power of 2,  a / b  can be done like

 ( Given a is unsigned, a > b and since our hash values are greater than 0,
a is our hash value, b the power of 2)

while(b != 1) {
   a >>= 1;
   b >>= 1;
}

 which is less clock cycles than normal division i guess :)




Thanks,

Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3620@rit.edu
ph : 585 764 4662


On Wed, Aug 23, 2017 at 4:20 PM, Pragash Vijayaragavan <pxv3620@g.rit.edu>
wrote:

> Hi ,
>
> The performance will depend on the time taken for calculating the hash1
> and hash2 values for each lookup.
>
> Can i know which hash functions are used to calculate the hash values for
> each incoming key. We use CRC32 which uses xxhash i guess.
> I could not find the implementation of the hash functions.
>
>
> Thanks,
>
> Pragash Vijayaragavan
> Grad Student at Rochester Institute of Technology
> email : pxv3620@rit.edu
> ph : 585 764 4662 <(585)%20764-4662>
>
>
> On Wed, Aug 23, 2017 at 2:28 PM, Dumitrescu, Cristian <
> cristian.dumitrescu@intel.com> wrote:
>
>>
>>
>> > -----Original Message-----
>> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Pragash
>> > Vijayaragavan
>> > Sent: Wednesday, August 23, 2017 3:16 PM
>> > To: dev@dpdk.org
>> > Cc: Minseok Kwon <mxkvcs@rit.edu>
>> > Subject: [dpdk-dev] cuckoo hash in dpdk
>> >
>> > Hi,
>> >
>> > I got the chance to look at the cuckoo hash used in dpdk and have a
>> query.
>> >
>> > would using division and modulo operations be slower than bitwise
>> > operations on RTE_HASH_BUCKET_ENTRIES, specially since
>> > RTE_HASH_BUCKET_ENTRIES is a power of 2.
>> > For example, to do a modulo we can do a "AND" operation on
>> > (RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
>> > filter for VPP and doing this gave a slight improvement in speed.
>> > Is there any particular reason its done this way.
>> >
>> > Sorry if i am being wrong in any way, i was just curious.
>> >
>> > Thanks,
>> >
>> > Pragash Vijayaragavan
>> > Grad Student at Rochester Institute of Technology
>> > email : pxv3620@rit.edu
>> > ph : 585 764 4662
>>
>> Bitwise AND typically takes 1 cycle on any CPU, while modulo takes dozens
>> of cycles.
>>
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] cuckoo hash in dpdk
  2017-08-23 14:15 [dpdk-dev] cuckoo hash in dpdk Pragash Vijayaragavan
  2017-08-23 18:28 ` Dumitrescu, Cristian
@ 2017-08-24 14:53 ` Andriy Berestovskyy
  2017-08-24 18:54   ` Pragash Vijayaragavan
  1 sibling, 1 reply; 7+ messages in thread
From: Andriy Berestovskyy @ 2017-08-24 14:53 UTC (permalink / raw)
  To: pxv3620; +Cc: DPDK, Minseok Kwon

Hey Pragash,
I am not the author of the code, but I guess it is done that way
because modern compilers do recognize power of two constants and do
substitute division and modulo operations with corresponding bit
manipulations.

Just try to compile a small program like the following:

volatile unsigned a = 123, b, c;
int main(int argc, char **argv)
{
    b = a / 4;
    c = a % 4;
    printf("%x %x %x\n", a, b, c);
}


and then disassemble it with gdb:

(gdb) disassemble /s main
[...]
13 b = a / 4;
   0x0000000000400464 <+20>: shr    $0x2,%eax
   0x0000000000400467 <+23>: mov    %eax,0x200bd3(%rip)        # 0x601040 <b>

14 c = a % 4;
   0x000000000040046d <+29>: mov    0x200bc5(%rip),%eax        # 0x601038 <a>
   0x0000000000400473 <+35>: and    $0x3,%eax
   0x0000000000400476 <+38>: mov    %eax,0x200bc8(%rip)        # 0x601044 <c>
[...]

As you can see both division and modulo was substituted with "shr" and "and".

So basically nowadays there is no need to worry about that and
complicate code with explicit low-level optimizations. Hope that
answers your question.

Regards,
Andriy


On Wed, Aug 23, 2017 at 4:15 PM, Pragash Vijayaragavan <pxv3620@rit.edu> wrote:
> Hi,
>
> I got the chance to look at the cuckoo hash used in dpdk and have a query.
>
> would using division and modulo operations be slower than bitwise
> operations on RTE_HASH_BUCKET_ENTRIES, specially since
> RTE_HASH_BUCKET_ENTRIES is a power of 2.
> For example, to do a modulo we can do a "AND" operation on
> (RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
> filter for VPP and doing this gave a slight improvement in speed.
> Is there any particular reason its done this way.
>
> Sorry if i am being wrong in any way, i was just curious.
>
> Thanks,
>
> Pragash Vijayaragavan
> Grad Student at Rochester Institute of Technology
> email : pxv3620@rit.edu
> ph : 585 764 4662



-- 
Andriy Berestovskyy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] cuckoo hash in dpdk
  2017-08-24 14:53 ` Andriy Berestovskyy
@ 2017-08-24 18:54   ` Pragash Vijayaragavan
  2017-08-25  9:00     ` Andriy Berestovskyy
  0 siblings, 1 reply; 7+ messages in thread
From: Pragash Vijayaragavan @ 2017-08-24 18:54 UTC (permalink / raw)
  To: Andriy Berestovskyy; +Cc: dev, Minseok Kwon

Thats great, what about the hash functions.
On 24 Aug 2017 10:54, "Andriy Berestovskyy" <aber@semihalf.com> wrote:

> Hey Pragash,
> I am not the author of the code, but I guess it is done that way
> because modern compilers do recognize power of two constants and do
> substitute division and modulo operations with corresponding bit
> manipulations.
>
> Just try to compile a small program like the following:
>
> volatile unsigned a = 123, b, c;
> int main(int argc, char **argv)
> {
>     b = a / 4;
>     c = a % 4;
>     printf("%x %x %x\n", a, b, c);
> }
>
>
> and then disassemble it with gdb:
>
> (gdb) disassemble /s main
> [...]
> 13 b = a / 4;
>    0x0000000000400464 <+20>: shr    $0x2,%eax
>    0x0000000000400467 <+23>: mov    %eax,0x200bd3(%rip)        # 0x601040
> <b>
>
> 14 c = a % 4;
>    0x000000000040046d <+29>: mov    0x200bc5(%rip),%eax        # 0x601038
> <a>
>    0x0000000000400473 <+35>: and    $0x3,%eax
>    0x0000000000400476 <+38>: mov    %eax,0x200bc8(%rip)        # 0x601044
> <c>
> [...]
>
> As you can see both division and modulo was substituted with "shr" and
> "and".
>
> So basically nowadays there is no need to worry about that and
> complicate code with explicit low-level optimizations. Hope that
> answers your question.
>
> Regards,
> Andriy
>
>
> On Wed, Aug 23, 2017 at 4:15 PM, Pragash Vijayaragavan <pxv3620@rit.edu>
> wrote:
> > Hi,
> >
> > I got the chance to look at the cuckoo hash used in dpdk and have a
> query.
> >
> > would using division and modulo operations be slower than bitwise
> > operations on RTE_HASH_BUCKET_ENTRIES, specially since
> > RTE_HASH_BUCKET_ENTRIES is a power of 2.
> > For example, to do a modulo we can do a "AND" operation on
> > (RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
> > filter for VPP and doing this gave a slight improvement in speed.
> > Is there any particular reason its done this way.
> >
> > Sorry if i am being wrong in any way, i was just curious.
> >
> > Thanks,
> >
> > Pragash Vijayaragavan
> > Grad Student at Rochester Institute of Technology
> > email : pxv3620@rit.edu
> > ph : 585 764 4662
>
>
>
> --
> Andriy Berestovskyy
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dpdk-dev] cuckoo hash in dpdk
  2017-08-24 18:54   ` Pragash Vijayaragavan
@ 2017-08-25  9:00     ` Andriy Berestovskyy
  0 siblings, 0 replies; 7+ messages in thread
From: Andriy Berestovskyy @ 2017-08-25  9:00 UTC (permalink / raw)
  To: pxv3620; +Cc: DPDK, users

Hey Pragash,
You can pass your own hash function to rte_hash_create() otherwise a
default one will be used, see
http://dpdk.org/browse/dpdk/tree/lib/librte_hash/rte_cuckoo_hash.c#n281

The default hash function is rte_hash_crc() or in some cases rte_jhash(), see
http://dpdk.org/browse/dpdk/tree/lib/librte_hash/rte_cuckoo_hash.h#n61

You can find the implementation of rte_hash_crc() over here:
http://dpdk.org/browse/dpdk/tree/lib/librte_hash/rte_hash_crc.h#n588



Please note there is a separate mailing list for DPDK usage discussions:
http://dpdk.org/ml/listinfo/users

The dev@ list is mostly for patch reviews and RFCs...


Andriy

On Thu, Aug 24, 2017 at 8:54 PM, Pragash Vijayaragavan <pxv3620@rit.edu> wrote:
> Thats great, what about the hash functions.
>
> On 24 Aug 2017 10:54, "Andriy Berestovskyy" <aber@semihalf.com> wrote:
>>
>> Hey Pragash,
>> I am not the author of the code, but I guess it is done that way
>> because modern compilers do recognize power of two constants and do
>> substitute division and modulo operations with corresponding bit
>> manipulations.
>>
>> Just try to compile a small program like the following:
>>
>> volatile unsigned a = 123, b, c;
>> int main(int argc, char **argv)
>> {
>>     b = a / 4;
>>     c = a % 4;
>>     printf("%x %x %x\n", a, b, c);
>> }
>>
>>
>> and then disassemble it with gdb:
>>
>> (gdb) disassemble /s main
>> [...]
>> 13 b = a / 4;
>>    0x0000000000400464 <+20>: shr    $0x2,%eax
>>    0x0000000000400467 <+23>: mov    %eax,0x200bd3(%rip)        # 0x601040
>> <b>
>>
>> 14 c = a % 4;
>>    0x000000000040046d <+29>: mov    0x200bc5(%rip),%eax        # 0x601038
>> <a>
>>    0x0000000000400473 <+35>: and    $0x3,%eax
>>    0x0000000000400476 <+38>: mov    %eax,0x200bc8(%rip)        # 0x601044
>> <c>
>> [...]
>>
>> As you can see both division and modulo was substituted with "shr" and
>> "and".
>>
>> So basically nowadays there is no need to worry about that and
>> complicate code with explicit low-level optimizations. Hope that
>> answers your question.
>>
>> Regards,
>> Andriy
>>
>>
>> On Wed, Aug 23, 2017 at 4:15 PM, Pragash Vijayaragavan <pxv3620@rit.edu>
>> wrote:
>> > Hi,
>> >
>> > I got the chance to look at the cuckoo hash used in dpdk and have a
>> > query.
>> >
>> > would using division and modulo operations be slower than bitwise
>> > operations on RTE_HASH_BUCKET_ENTRIES, specially since
>> > RTE_HASH_BUCKET_ENTRIES is a power of 2.
>> > For example, to do a modulo we can do a "AND" operation on
>> > (RTE_HASH_BUCKET_ENTRIES - 1), which might be faster. We did a cuckoo
>> > filter for VPP and doing this gave a slight improvement in speed.
>> > Is there any particular reason its done this way.
>> >
>> > Sorry if i am being wrong in any way, i was just curious.
>> >
>> > Thanks,
>> >
>> > Pragash Vijayaragavan
>> > Grad Student at Rochester Institute of Technology
>> > email : pxv3620@rit.edu
>> > ph : 585 764 4662
>>
>>
>>
>> --
>> Andriy Berestovskyy



-- 
Andriy Berestovskyy

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-08-25  9:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-23 14:15 [dpdk-dev] cuckoo hash in dpdk Pragash Vijayaragavan
2017-08-23 18:28 ` Dumitrescu, Cristian
2017-08-23 20:20   ` Pragash Vijayaragavan
2017-08-23 20:39     ` Pragash Vijayaragavan
2017-08-24 14:53 ` Andriy Berestovskyy
2017-08-24 18:54   ` Pragash Vijayaragavan
2017-08-25  9:00     ` Andriy Berestovskyy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).