DPDK patches and discussions
* [dpdk-dev] [PATCH v2] Implement rte_memcmp with AVX/SSE instructions.
@ 2015-05-08 21:19 Ravi Kerur
From: Ravi Kerur @ 2015-05-08 21:19 UTC
  To: dev

Background:
After a preliminary discussion with John (Zhihong) and Tim from Intel, we
decided that it would be beneficial to use AVX/SSE instructions for memcmp,
similar to the memcpy work being implemented. In addition, we decided to use
librte_hash as a test candidate to test both functionality and performance.

Currently, memcmp in librte_hash is used for key comparisons. Key length
can vary, and the maximum key length is defined as 64 bytes. Preliminary
tests on memory comparison alone show that the AVX/SSE implementation takes
about one third of the CPU ticks of the regular memcmp function.
Furthermore, hash_perf_autotest shows better results in all categories.
Please note that memory comparison is a small portion of the hash
functionality, and the CPU Ticks/Op figures are for whole hash operations
(Add on Empty, Add Update, Lookup). Only hash lookup results are shown
below. I can send the complete results if anyone is interested.

Tests were conducted on an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz, Ubuntu
14.04, x86_64, 16GB DDR3 system.

PS: I would like to keep "rte_memcmp" simple, with the return codes

0 - match
1 - no-match

since usage in DPDK is for equality or inequality, and I have not seen
any instance where a less-than/greater-than comparison is needed. Hence
the "if (unlikely(...))" portion of the code will probably be removed, and
the function will be made specific to DPDK rather than generic.
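
To illustrate the match/no-match convention above, here is a minimal sketch
of an equality-only comparison built on SSE2 intrinsics. It is not the code
from the patch: the names eq16 and memcmp_eq are hypothetical, and it assumes
an x86 target with SSE2 and uses unaligned loads throughout.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): compare exactly 16 bytes.
 * Returns 0 on match, 1 on no-match. */
static inline int
eq16(const uint8_t *a, const uint8_t *b)
{
	__m128i va = _mm_loadu_si128((const __m128i *)a);
	__m128i vb = _mm_loadu_si128((const __m128i *)b);

	/* Equal bytes become 0xFF; movemask packs their top bits into
	 * 16 bits, so 0xFFFF means all 16 bytes matched. */
	return _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) != 0xFFFF;
}

/* Hypothetical helper: equality-only compare of n bytes.
 * Returns 0 on match, 1 on no-match; no ordering information. */
static inline int
memcmp_eq(const void *s1, const void *s2, size_t n)
{
	const uint8_t *a = s1, *b = s2;

	/* Full 16-byte chunks go through the vector compare. */
	for (; n >= 16; a += 16, b += 16, n -= 16)
		if (eq16(a, b))
			return 1;

	/* Remaining tail (fewer than 16 bytes) is compared bytewise. */
	for (; n > 0; a++, b++, n--)
		if (*a != *b)
			return 1;

	return 0;
}
```

Because the result only says whether the buffers differ, the function never
has to locate the first differing byte or compare its value, which is the
part a generic memcmp needs for less-than/greater-than results.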

/*************Existing code**********************************/
 *** Hash table performance test results ***
Hash Func.  , Operation      , Key size (bytes), Entries, Entries per bucket, Errors  , Avg. bucket entries, Ticks/Op.
rte_hash_crc, Lookup         , 16              , 1024   , 1                 , 10000   , 0.00               , 88.55
rte_hash_crc, Lookup         , 16              , 1024   , 2                 , 10000   , 0.00               , 99.28
rte_hash_crc, Lookup         , 16              , 1024   , 4                 , 10000   , 0.00               , 106.73
rte_hash_crc, Lookup         , 16              , 1024   , 8                 , 10000   , 0.00               , 126.99
rte_hash_crc, Lookup         , 16              , 1024   , 16                , 10000   , 0.00               , 159.80

rte_hash_crc, Lookup         , 16              , 1048576, 1                 , 51      , 0.01               , 175.23
rte_hash_crc, Lookup         , 16              , 1048576, 2                 , 2       , 0.02               , 171.24
rte_hash_crc, Lookup         , 16              , 1048576, 4                 , 0       , 0.04               , 145.48
rte_hash_crc, Lookup         , 16              , 1048576, 8                 , 0       , 0.08               , 162.35
rte_hash_crc, Lookup         , 16              , 1048576, 16                , 0       , 0.15               , 182.42

jhash       , Lookup         , 16              , 1048576, 1                 , 33      , 0.01               , 219.71
jhash       , Lookup         , 16              , 1048576, 2                 , 1       , 0.02               , 216.44
jhash       , Lookup         , 16              , 1048576, 4                 , 0       , 0.04               , 188.29
jhash       , Lookup         , 16              , 1048576, 8                 , 0       , 0.08               , 203.70
jhash       , Lookup         , 16              , 1048576, 16                , 0       , 0.15               , 229.50

/**************New AVX/SSE code******************************/
Hash Func.  , Operation      , Key size (bytes), Entries, Entries per bucket, Errors  , Avg. bucket entries, Ticks/Op.
rte_hash_crc, Lookup         , 16              , 1024   , 1                 , 10000   , 0.00               , 85.69
rte_hash_crc, Lookup         , 16              , 1024   , 2                 , 10000   , 0.00               , 93.95
rte_hash_crc, Lookup         , 16              , 1024   , 4                 , 10000   , 0.00               , 102.80
rte_hash_crc, Lookup         , 16              , 1024   , 8                 , 10000   , 0.00               , 122.60
rte_hash_crc, Lookup         , 16              , 1024   , 16                , 10000   , 0.00               , 156.58

rte_hash_crc, Lookup         , 16              , 1048576, 1                 , 41      , 0.01               , 156.84
rte_hash_crc, Lookup         , 16              , 1048576, 2                 , 0       , 0.02               , 157.90
rte_hash_crc, Lookup         , 16              , 1048576, 4                 , 0       , 0.04               , 134.92
rte_hash_crc, Lookup         , 16              , 1048576, 8                 , 0       , 0.08               , 150.99
rte_hash_crc, Lookup         , 16              , 1048576, 16                , 0       , 0.15               , 174.08

jhash       , Lookup         , 16              , 1048576, 1                 , 45      , 0.01               , 212.03
jhash       , Lookup         , 16              , 1048576, 2                 , 2       , 0.02               , 210.65
jhash       , Lookup         , 16              , 1048576, 4                 , 0       , 0.04               , 185.90
jhash       , Lookup         , 16              , 1048576, 8                 , 0       , 0.08               , 201.35
jhash       , Lookup         , 16              , 1048576, 16                , 0       , 0.15               , 223.54
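
To put the two tables side by side, the relative change in Ticks/Op can be
computed as (before - after) / before. The helper below is only a sketch for
that arithmetic; the name ticks_reduction_pct is hypothetical and is not part
of the patch or of hash_perf_autotest.

```c
/* Hypothetical helper: percentage reduction in Ticks/Op going from the
 * existing code (before) to the AVX/SSE code (after). */
static inline double
ticks_reduction_pct(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

Applied to the rows with 1 entry per bucket, this gives roughly 3.2% for
rte_hash_crc with 1024 entries (88.55 to 85.69), roughly 10.5% for
rte_hash_crc with 1048576 entries (175.23 to 156.84), and roughly 3.5% for
jhash with 1048576 entries (219.71 to 212.03). The Errors column differs
between the two runs, so these figures are indicative only.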

Ravi Kerur (1):
  Implement memcmp using AVX/SSE instructions.

 app/test/test_hash_perf.c                          |  36 +-
 .../common/include/arch/ppc_64/rte_memcmp.h        |  62 +++
 .../common/include/arch/x86/rte_memcmp.h           | 421 +++++++++++++++++++++
 lib/librte_eal/common/include/generic/rte_memcmp.h | 131 +++++++
 lib/librte_hash/rte_hash.c                         |  59 ++-
 5 files changed, 675 insertions(+), 34 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcmp.h

-- 
1.9.1


Thread overview: 21+ messages
2015-05-08 21:19 [dpdk-dev] [PATCH v2] Implement rte_memcmp with AVX/SSE instructions Ravi Kerur
2015-05-08 21:19 ` [dpdk-dev] [PATCH v2] Implement memcmp using " Ravi Kerur
2015-05-08 22:29   ` Matt Laswell
2015-05-08 22:54     ` Ravi Kerur
2015-05-08 23:25       ` Matt Laswell
2015-05-11  9:51       ` Ananyev, Konstantin
2015-05-11 17:42         ` Ravi Kerur
     [not found]           ` <2601191342CEEE43887BDE71AB9772582142E44A@irsmsx105.ger.corp.intel.com>
2015-05-11 19:35             ` Ananyev, Konstantin
2015-05-11 20:46               ` Ravi Kerur
2015-05-11 22:29                 ` Don Provan
2015-05-13  1:16                   ` Ravi Kerur
2015-05-13  9:03                     ` Bruce Richardson
2015-05-13 20:08                       ` Ravi Kerur
2015-05-13 12:21                     ` Jay Rolette
2015-05-13 20:07                       ` Ravi Kerur
     [not found]                 ` <2601191342CEEE43887BDE71AB9772582142EBB5@irsmsx105.ger.corp.intel.com>
2015-05-13 10:12                   ` Ananyev, Konstantin
2015-05-13 20:06                     ` Ravi Kerur
2015-05-12  8:13   ` Linhaifeng
2015-05-13  1:18     ` Ravi Kerur
2015-05-13  7:22       ` Linhaifeng
2015-05-13 20:00         ` Ravi Kerur
