DPDK patches and discussions
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas@monjalon.net>
To: Yipeng Wang <yipeng1.wang@intel.com>
Cc: dev@dpdk.org, bruce.richardson@intel.com, honnappa.nagarahalli@arm.com
Subject: Re: [dpdk-dev] [PATCH v8 0/4] hash: add extendable bucket and partial key hashing
Date: Fri, 26 Oct 2018 00:35:24 +0200	[thread overview]
Message-ID: <1717822.zS3DjFTGi4@xps> (raw)
In-Reply-To: <1540233588-202969-1-git-send-email-yipeng1.wang@intel.com>

22/10/2018 20:39, Yipeng Wang:
> This patch has dependency on another bug fix patch set:
> http://patchwork.dpdk.org/cover/45611/
> 
> This patch set makes two major optimizations over the current rte_hash
> library.
> 
> First, it adds Extendable Bucket Table feature: a new structure that can
> accommodate keys that failed to get inserted into the main hash table due to
> the unlikely event of excessive hash collisions. The hash table buckets will
> get extended using a linked list to host these keys. This new design will
> guarantee insertion of 100% of the keys for a given hash table size with
> minimal overhead. A new flag value is added for user to indicate if the
> extendable bucket feature should be enabled or not. The linked list buckets is
> similar concept to the extendable bucket hash table in packet framework.
> In details, for insertion, the linked buckets will be used to store the keys
> that fail to get in the primary and the secondary bucket and the cuckoo path
> could not find an empty location for the maximum path length (small
> probability). For lookup, the key is checked first in the primary, then the
> secondary, then if the secondary is extended the linked list is traversed
> for a possible match.
> 
> Second, the patch set changes the current hashing algorithm to be "partial-key
> hashing". Partial-key hashing is the concept from Bin Fan, et al.'s paper
> "MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter
> Hashing". Instead of storing both 32-bit signature and alternative signature
> in the bucket, we only store a small 16-bit signature and calculate the
> alternative bucket index by XORing the signature with the current bucket index.
> This doubles the hash table memory efficiency since now one bucket
> only occupies one cache line instead of two in the original design.
> 
> v7->v8:
> patchwork splits the V7 into two patch series and the first one got merged
> into V6.
> Try to post again to resolve this issue.
> 
> v6->v7:
> Fix a bug accidentally introduced in V5. In iterate function, the main table
> copmleted condition should be 
> *next == total_entries_main instead of total_entries
> 
> v5->v6:
> 1. hash: fix a typo in comment, found by Honnappa.
> 2. Fix typos in commit messages.
> 3. Add review-by and acked-by.
> 
> v4->v5:
> 1. hash: for the first commit, move back the lock and read "position" in the
> while condition as Honnappa suggested.
> 2. hash: minor coding style change (Honnappa) and commit message typo fix.
> 3. Add Review-by from Honnappa.
> 
> v3->v4:
> 1. hash: Revise commit message to be more clear for "utilization" (Honnappa)
> 2. hash: in delete key function, return bucket change to use rte_ring_sp_enqueue
> instead of rte_ring_mp_enqueue, since it is already protected inside locks.
> 3. hash: update rte_hash_iterate comments (Honnappa)
> 4. hash: Add a new commit to fix race condition in the rte_hash_iterate (Honnappa)
> 5. hash/test: during utilization test, double check rte_hash_cnt returns correct
> value (Honnappa)
> 6. hash: for partial-key-hashing commit, break the get_buckets_index function
> into three. It may make future extension easier (Honnappa)
> 7. hash: change the comment for typedef uint32_t hash_sig_t to be more clear
> to users (Honnappa)
> 
> v2->v3:
> The first four commits were separated from this patch set as another
> independent patch set:
> https://mails.dpdk.org/archives/dev/2018-September/113118.html
> 1. hash: move snprintf for ext_ring name under the ext_table condition.
> 2. hash: fix memory leak by freeing ext_buckets in rte_hash_free.
> 3. hash: after failing cuckoo path, search not only ext buckets, but also the
> secondary bucket first to see if there may be an empty location now.
> 4. hash: totally rewrote the key deleting function logic. If the deleted key was
> not in the last bucket of the linked list when ext table enabled, the last
> entry in the linked list will be placed in the vacant slot from the deleted
> key. The purpose is to compact the entries in the linked list to be more close
> to the main table. This is to make sure that not many extendable buckets are
> wasted with only one or two entries after some time of running, also benefit
> lookup speed.
> 5. Other minor coding style/comments improvements.
> 
> V1->V2:
> 1. hash: Rewrite rte_hash_get_last_bkt to be more concise.
> 2. hash: Reorder the rte_hash struct to align cache line better.
> 3. test: Minor changes in auto test to add key insertion failure check during
> iteration test.
> 4. test: Add new commit to fix read-write test non-consecutive core issue.
> 4. hash: Add a new commit to remove unnecessary code introduced by previous
> patches.
> 5. hash: Comments improvement and coding style improvements over multiple
> places.
> 
> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
> 
> 
> Yipeng Wang (4):
>   hash: fix race condition in iterate
>   hash: add extendable bucket feature
>   test/hash: implement extendable bucket hash test
>   hash: use partial-key hashing

Applied, thanks

      parent reply	other threads:[~2018-10-25 22:35 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-22 18:39 Yipeng Wang
2018-10-22 18:39 ` [dpdk-dev] [PATCH v8 1/4] hash: fix race condition in iterate Yipeng Wang
2018-10-23  8:53   ` Bruce Richardson
2018-10-22 18:39 ` [dpdk-dev] [PATCH v8 2/4] hash: add extendable bucket feature Yipeng Wang
2018-10-23  8:55   ` Bruce Richardson
2018-10-22 18:39 ` [dpdk-dev] [PATCH v8 3/4] test/hash: implement extendable bucket hash test Yipeng Wang
2018-10-23  9:04   ` Bruce Richardson
2018-10-22 18:39 ` [dpdk-dev] [PATCH v8 4/4] hash: use partial-key hashing Yipeng Wang
2018-10-23  9:07   ` Bruce Richardson
2018-10-25 22:35 ` Thomas Monjalon [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1717822.zS3DjFTGi4@xps \
    --to=thomas@monjalon.net \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=yipeng1.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).