From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 2548F1B1F9 for ; Sat, 29 Sep 2018 02:50:39 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 28 Sep 2018 17:50:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,317,1534834800"; d="scan'208";a="94889566" Received: from fmsmsx104.amr.corp.intel.com ([10.18.124.202]) by orsmga001.jf.intel.com with ESMTP; 28 Sep 2018 17:46:17 -0700 Received: from fmsmsx157.amr.corp.intel.com (10.18.116.73) by fmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP Server (TLS) id 14.3.319.2; Fri, 28 Sep 2018 17:46:16 -0700 Received: from fmsmsx151.amr.corp.intel.com ([169.254.7.87]) by FMSMSX157.amr.corp.intel.com ([169.254.14.71]) with mapi id 14.03.0319.002; Fri, 28 Sep 2018 17:46:16 -0700 From: "Wang, Yipeng1" To: Honnappa Nagarahalli , "Richardson, Bruce" CC: "dev@dpdk.org" , "michel@digirati.com.br" , nd Thread-Topic: [PATCH v2 0/7] hash: add extendable bucket and partial key hashing Thread-Index: AQHUUgpQ9goevBjst0qR7BsRvEtccaT8zaJggAml8eA= Date: Sat, 29 Sep 2018 00:46:15 +0000 Message-ID: References: <1536253745-133104-1-git-send-email-yipeng1.wang@intel.com> <1537550255-252066-1-git-send-email-yipeng1.wang@intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.0.400.15 dlp-reaction: no-action x-ctpclassification: CTP_NT x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZjUzYzRkMmMtZGZiNS00N2U5LTk0OTItMTA0OGU0ZGZiOTkwIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoic2ZiTmNjYkZDSjNtNmkzTFBkRVwveGlvNTN0TjArVjNjQkh0NDdSTkkzTWlzb0Q3bGE2d0R5WlNOSVhFUHlDRHQifQ== x-originating-ip: [10.1.200.106] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v2 0/7] hash: add extendable bucket and partial key hashing X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Sep 2018 00:50:40 -0000 >-----Original Message----- >From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com] >> Second, the patch set changes the current hashing algorithm to be "parti= al- >> key hashing". Partial-key hashing is the concept from Bin Fan, et al.'s = paper >> "MemC3: Compact and Concurrent MemCache with Dumber Caching and >> Smarter Hashing". >I read this paper (but not the papers in references). My understanding is = that the existing algorithm already uses 'partial-key hashing'. >This patch set is not adding the 'partial-key hashing' feature. Instead it= is reducing the size of the signature ('tag' as referred in the >paper) from 32 to 16b. >Please let me know if I have not understood this correct. [Wang, Yipeng] Currently two signature values are stored in hash table: sig= _current and sig_alt. They are used to derived the index of two alternative buckets. Partial key hashing avoids storing two values, but stores only one. The alt= ernative bucket Index is derived by XOR this single signature with the current bucket index= . So, this commit not only reduces the signature size from 32-bit to 16-bit, = it also reduces the number of signatures stored from two to one. As a result, we can use one 64-byte cache line for the bucket instead of tw= o. >> Instead of storing both 32-bit signature and alternative >> signature in the bucket, we only store a small 16-bit signature and calc= ulate >> the alternative bucket index by XORing the signature with the current bu= cket >> index. >According to the referenced paper, the signature ('tag') reduces the numbe= r of accesses to the keys, thus improving the performance. >But, if we reduce the size of the signature from 32b to 16b, it will resul= t in higher probability of false matches on the signature. This in >turn will increase the number of accesses to keys. Have you run any perfor= mance benchmarks and compared the numbers with the >existing code? Is it possible to share the numbers? > [Wang, Yipeng]=20 >>From our test it is very unlikely that two different keys would map to the = same bucket and also have the same 16-bit signature. Even we reduce the signature size from 32-bit to 16-bit, it should be very = very rare assuming a good hash function. For performance numbers, it could vary depending on the test case. Since th= e speedup comes from the 2X memory efficiency, If your hash table is large (e.g. over last level cache), it will give much= higher speedup. For the existing unit test, I generally seen 10% speedup.= =20