DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Wang, Yipeng1" <yipeng1.wang@intel.com>
To: Michel Machado <michel@digirati.com.br>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	"Fu, Qiaobin" <qiaobinf@bu.edu>
Cc: "Richardson, Bruce" <bruce.richardson@intel.com>,
	"De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Doucette, Cody, Joseph" <doucette@bu.edu>,
	"Gobriel, Sameh" <sameh.gobriel@intel.com>,
	"Tai, Charlie" <charlie.tai@intel.com>
Subject: Re: [dpdk-dev] [PATCH] hash table: add a bucket iterator function
Date: Wed, 1 Aug 2018 01:40:06 +0000	[thread overview]
Message-ID: <D2C4A16CA39F7F4E8E384D204491D7A66148315C@ORSMSX105.amr.corp.intel.com> (raw)
In-Reply-To: <8a2eb96f-bf40-a6af-c22c-072f915e063e@digirati.com.br>

Hi,

Thanks for explaining the use case. I now understand and I agree that replacing stale entries during insertion is an interesting and useful addition to this library. 

My concern is still that it may be not a good practice to expose the bucketized structure of the rte_hash. For example, in future, rte_hash may use three hash functions, or as I mentioned each bucket may have an additional linked list or even a second level hash table, or if the hopscotch hash replaces cuckoo hash as the new algorithm. When these happen, the proposed API may lose their original intention. As a general hash table library, common users should assume a black box of the implementation.

For the iteration function, I agree on the advantage over current entry by entry iteration. But it exposes the bucket index as well.

How about an API that is more universal? For example, an API such as "rte_iterate_conflict_entries".  After an insertion failure, this function will iterate all entries that may conflict with the newly inserted key and you could decide which entry to evict? 

Thanks
Yipeng

>-----Original Message-----
>From: Michel Machado [mailto:michel@digirati.com.br]
>Sent: Tuesday, July 31, 2018 8:33 AM
>To: Wiles, Keith <keith.wiles@intel.com>; Fu, Qiaobin <qiaobinf@bu.edu>
>Cc: Wang, Yipeng1 <yipeng1.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; De Lara Guarch, Pablo
><pablo.de.lara.guarch@intel.com>; dev@dpdk.org; Doucette, Cody, Joseph <doucette@bu.edu>; Gobriel, Sameh
><sameh.gobriel@intel.com>; Tai, Charlie <charlie.tai@intel.com>
>Subject: Re: [dpdk-dev] [PATCH] hash table: add a bucket iterator function
>
>On 07/31/2018 10:57 AM, Wiles, Keith wrote:
>>
>>> On Jul 31, 2018, at 1:09 AM, Fu, Qiaobin <qiaobinf@bu.edu> wrote:
>>>
>>> Hi Yipeng,
>>>
>>> Thanks for the feedbacks!
>>>
>>>> On Jul 30, 2018, at 4:24 PM, Wang, Yipeng1 <yipeng1.wang@intel.com> wrote:
>>>>
>>>> Hi, Qiaobin,
>>>>
>>>> Thanks for the patch. If I understand correctly your use case is to use hash table as a "cache" that new entries should evict stale
>ones automatically. Could you be more specific on your use scenarios so that I can have more context?
>>>>
>>>
>>> Actually, we didn’t use hash table as a “cache”, and there is no automation to evict stale ones. Instead, the
>functionrte_hash_bucket_iterate() allows the callers to iterate over the hash table in an incremental way, i.e., one bucket by another
>bucket, so that the caller doesn’t have to iterate over the whole hash table in one time. This way can avoid creating hiccups and
>achieve better cache performance. One possible use scenario is when DDoS attacks happen, one may want to take more CPU time to
>process the packets, thus iterating over the hash table incrementally is desirable.
>>
>> I do not see this as a cache, but only a way to iterate over the hash table.
>>
>>>
>>>> We are actually working on an extendable version of rte_hash which will link extra buckets to current table if the table is full. As
>long as the table is sized appropriately, there will be no conflict issues thus you don’t need to replace old entries. Will this solve your
>issue? Also, if the “cache” mode is what you want, we have the membership library “rte_member”. Is it applicable to your case?
>>>
>>> I agree that adding extra buckets to current table when the table is full can alleviate this scenario. Indeed, we also considered this
>solution before coming up our proposed solution. However, it’s still highly desirable to have a bucket iterator function. Considering
>the scenario where the machine’s memory is limited and cannot always allocate new memory to the hash table (e.g., DDoS attacks,
>switch measurement tasks, etc.), a solution is to allow the callers evict some less important (the criteria for the eviction is based on the
>caller’s needs) entries.
>>>
>>> Indeed, we don’t have the “cache” mode, our implementation is a way to achieve better cache performance. So, the rte_member
>won’t help here.
>>>
>>>>
>>>> w.r.t. the patch set you proposed, my concern is that the new API will expose too much internals to the user, e.g. the
>bucket/entries structure should be implementation dependent. Exposing internal structures would prevent improving the
>implementation down the road unless we keep changing API which is not recommended.
>>>>
>>>
>>> The functions we add here give a way for the callers to iterate over the specific bucket, which potentially contains the entry. These
>functions can be made general enough to allow callers to heuristically evict some entries, and add the new ones to the hash table.
>Otherwise, there is no way to evict some less important entries.
>>
>> We have many other iterators in DPDK and this one is no different. If we can hide the internals, then it would be good. The callback
>function should not expose any internals only the value in the hash table, which I believe is do just that, right?
>
>    The use case that led to this iterator is the following: a hash
>table of flows is overloaded due to a DoS attack. The easy option is to
>ignore new flows, but this is not optimal when there's information
>pointing out that a given new flow is more important than one flow in
>the bucket in which the new flow must be inserted. So the high-level
>interpretation for this iterator is to find out which are the candidates
>such that one must be dropped to add a new flow. That is why the patch
>adds rte_hash_get_primary_bucket() and rte_hash_get_secondary_bucket().
>We don't need more than these candidates because the memory lookups
>would be overwhelming. In fact, we have found that the eight candidates
>that rte_hash_get_primary_bucket() gives is already good enough.
>
>    Once we solved the problem above, we've noticed that with small
>adjustments of the code, it would make it easy to scan a hash table for
>stale entries one bucket at a time instead of an entry at a time (the
>regular iterator). The idea is that once one reads the bucket location,
>the information about the all entries is coming together into the
>processor cache, so we can take advantage of the information that is
>already there.
>
>    While the second feature is good to have, the first one is the
>feature we must have in our software.
>
>[ ]'s
>Michel Machado

  reply	other threads:[~2018-08-01  1:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-28 17:48 Qiaobin Fu
2018-07-29 13:17 ` Wiles, Keith
     [not found]   ` <D2C4A16CA39F7F4E8E384D204491D7A66148250A@ORSMSX105.amr.corp.intel.com>
2018-07-31  6:09     ` Fu, Qiaobin
2018-07-31 14:57       ` Wiles, Keith
2018-07-31 15:32         ` Michel Machado
2018-08-01  1:40           ` Wang, Yipeng1 [this message]
2018-08-01 12:57             ` Michel Machado
2018-08-03 15:24               ` Stephen Hemminger
2018-08-07 13:24                 ` Michel Machado
  -- strict thread matches above, loose matches on Subject: below --
2018-07-15 17:15 Qiaobin Fu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D2C4A16CA39F7F4E8E384D204491D7A66148315C@ORSMSX105.amr.corp.intel.com \
    --to=yipeng1.wang@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=charlie.tai@intel.com \
    --cc=dev@dpdk.org \
    --cc=doucette@bu.edu \
    --cc=keith.wiles@intel.com \
    --cc=michel@digirati.com.br \
    --cc=pablo.de.lara.guarch@intel.com \
    --cc=qiaobinf@bu.edu \
    --cc=sameh.gobriel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).