To: "Wiles, Keith", "Fu, Qiaobin"
Cc: "Wang, Yipeng1", "Richardson, Bruce", "De Lara Guarch, Pablo", "dev@dpdk.org", "Doucette, Cody, Joseph", "Gobriel, Sameh", "Tai, Charlie"
References: <20180728174851.46422-1-qiaobinf@bu.edu> <5CCC4983-8B4F-480B-B6E1-3A01806BC416@intel.com> <7E249D4F-3CAA-4EDB-936D-4DFAEA55F0D8@bu.edu>
From: Michel Machado
Organization: Digirati Internet LTDA.
Message-ID: <8a2eb96f-bf40-a6af-c22c-072f915e063e@digirati.com.br>
Date: Tue, 31 Jul 2018 11:32:53 -0400
Subject: Re: [dpdk-dev] [PATCH] hash table: add a bucket iterator function
List-Id: DPDK patches and discussions

On 07/31/2018 10:57 AM, Wiles, Keith wrote:
>
>> On Jul 31, 2018, at 1:09 AM, Fu, Qiaobin wrote:
>>
>> Hi Yipeng,
>>
>> Thanks for the feedback!
>>
>>> On Jul 30, 2018, at 4:24 PM, Wang, Yipeng1 wrote:
>>>
>>> Hi, Qiaobin,
>>>
>>> Thanks for the patch. If I understand correctly your use case is to use hash table as a "cache" that new entries should evict stale ones automatically. Could you be more specific on your use scenarios so that I can have more context?
>>>
>>
>> Actually, we didn’t use hash table as a “cache”, and there is no automation to evict stale ones. Instead, the function rte_hash_bucket_iterate() allows the callers to iterate over the hash table in an incremental way, i.e., one bucket after another, so that the caller doesn’t have to iterate over the whole hash table at one time. This way can avoid creating hiccups and achieve better cache performance. One possible use scenario is when DDoS attacks happen, one may want to take more CPU time to process the packets, thus iterating over the hash table incrementally is desirable.
>
> I do not see this as a cache, but only a way to iterate over the hash table.
>
>>
>>> We are actually working on an extendable version of rte_hash which will link extra buckets to current table if the table is full.
>>> As long as the table is sized appropriately, there will be no conflict issues thus you don’t need to replace old entries. Will this solve your issue? Also, if the “cache” mode is what you want, we have the membership library “rte_member”. Is it applicable to your case?
>>
>> I agree that adding extra buckets to current table when the table is full can alleviate this scenario. Indeed, we also considered this solution before coming up with our proposed solution. However, it’s still highly desirable to have a bucket iterator function. Considering the scenario where the machine’s memory is limited and cannot always allocate new memory to the hash table (e.g., DDoS attacks, switch measurement tasks, etc.), a solution is to allow the callers to evict some less important entries (the criteria for the eviction are based on the caller’s needs).
>>
>> Indeed, we don’t have the “cache” mode, our implementation is a way to achieve better cache performance. So, the rte_member won’t help here.
>>
>>>
>>> w.r.t. the patch set you proposed, my concern is that the new API will expose too much internals to the user, e.g. the bucket/entries structure should be implementation dependent. Exposing internal structures would prevent improving the implementation down the road unless we keep changing API which is not recommended.
>>>
>>
>> The functions we add here give a way for the callers to iterate over the specific bucket, which potentially contains the entry. These functions can be made general enough to allow callers to heuristically evict some entries, and add the new ones to the hash table. Otherwise, there is no way to evict some less important entries.
>
> We have many other iterators in DPDK and this one is no different. If we can hide the internals, then it would be good. The callback function should not expose any internals, only the value in the hash table, which I believe it does, right?

The use case that led to this iterator is the following: a hash table of flows is overloaded due to a DoS attack. The easy option is to ignore new flows, but this is not optimal when there is information indicating that a given new flow is more important than one of the flows in the bucket into which the new flow must be inserted. So the high-level purpose of this iterator is to find the candidates from which one must be dropped to make room for a new flow. That is why the patch adds rte_hash_get_primary_bucket() and rte_hash_get_secondary_bucket(). We don't need more than these candidates because the memory lookups would be overwhelming. In fact, we have found that the eight candidates that rte_hash_get_primary_bucket() gives are already good enough.

Once we solved the problem above, we noticed that, with small adjustments to the code, it would be easy to scan a hash table for stale entries one bucket at a time instead of one entry at a time (the regular iterator). The idea is that once one reads the bucket location, the information about all of its entries comes into the processor cache together, so we can take advantage of the information that is already there.

While the second feature is good to have, the first one is the feature we must have in our software.

[ ]'s
Michel Machado
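
To make the two use cases above concrete, here is a minimal sketch in plain C. It does not call the patch's functions, since their signatures are not shown in this message; instead it uses a toy fixed-size table, and every name in it (flow_table, flow_insert, flow_expire_bucket, the priority and last_seen fields) is a hypothetical stand-in. The first function picks a victim only among the eight entries of the key's primary bucket; the second expires stale entries one bucket at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES_PER_BUCKET 8  /* matches the "eight candidates" mentioned above */
#define NUM_BUCKETS 4         /* toy size, must be a power of two */

/* Hypothetical flow record; the fields are assumptions for this sketch. */
struct flow_entry {
    uint32_t key;        /* flow identifier */
    uint32_t priority;   /* caller-defined importance, higher is kept */
    uint64_t last_seen;  /* caller-defined timestamp used for staleness */
    int in_use;
};

struct flow_bucket { struct flow_entry e[ENTRIES_PER_BUCKET]; };
struct flow_table  { struct flow_bucket b[NUM_BUCKETS]; };

/* Toy "hash": the low bits of the key select the primary bucket. */
static struct flow_bucket *primary_bucket(struct flow_table *t, uint32_t key)
{
    return &t->b[key & (NUM_BUCKETS - 1)];
}

/*
 * Insert a flow, looking only at the candidates in its primary bucket.
 * Returns 0 if a free slot was used, 1 if a less important flow was
 * evicted, and -1 if the new flow was dropped instead.
 * (No duplicate-key check, to keep the sketch short.)
 */
static int flow_insert(struct flow_table *t, uint32_t key,
                       uint32_t priority, uint64_t now)
{
    struct flow_bucket *b = primary_bucket(t, key);
    struct flow_entry *slot = NULL;
    int evicted = 0;

    for (size_t i = 0; i < ENTRIES_PER_BUCKET; i++) {
        struct flow_entry *e = &b->e[i];
        if (!e->in_use) {          /* free slot: no eviction needed */
            slot = e;
            evicted = 0;
            break;
        }
        if (slot == NULL || e->priority < slot->priority) {
            slot = e;              /* least important candidate so far */
            evicted = 1;
        }
    }
    if (evicted && slot->priority >= priority)
        return -1;                 /* every candidate matters at least as much */

    *slot = (struct flow_entry){ .key = key, .priority = priority,
                                 .last_seen = now, .in_use = 1 };
    return evicted;
}

/*
 * Expire the entries of ONE bucket that were last seen before `cutoff`.
 * Calling this with successive indexes spreads a full scan over time.
 * Returns the number of entries removed.
 */
static unsigned flow_expire_bucket(struct flow_table *t, size_t idx,
                                   uint64_t cutoff)
{
    struct flow_bucket *b = &t->b[idx & (NUM_BUCKETS - 1)];
    unsigned removed = 0;

    for (size_t i = 0; i < ENTRIES_PER_BUCKET; i++) {
        if (b->e[i].in_use && b->e[i].last_seen < cutoff) {
            b->e[i].in_use = 0;
            removed++;
        }
    }
    return removed;
}
```

A caller would zero the table, call flow_insert() for each new flow, and call flow_expire_bucket() with an incrementing index from its main loop. Both functions touch a single bucket per call, which is the point of the proposal: the candidates examined are the ones that arrive in the cache together.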