From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 6F25642CB6; Wed, 14 Jun 2023 11:04:07 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0356240E0F; Wed, 14 Jun 2023 11:04:07 +0200 (CEST) Received: from NAM04-BN8-obe.outbound.protection.outlook.com (mail-bn8nam04on2052.outbound.protection.outlook.com [40.107.100.52]) by mails.dpdk.org (Postfix) with ESMTP id 4374040698 for ; Tue, 13 Jun 2023 08:34:19 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Xznka+O1gX1KT2I2iJdGsYXu9xILtAJkUUoXLUrcikJwIRjXzE9d2LrUW2ud238xayfbAR0XCDiEmnWYZbw6pDIrFfn3KYeRjg6XVnNPpc19lbbVu95WC47J9ty2Unf8GB9yIYWhu0TrXjz6K32zw7pQegyKXzqEy5mUHCLNImzjp1KLDeu8hXBdyt/y34PKcl4ivQkzG7208ghvHhHDcGgHl9WOW6nmiPiSsFD0f9hLs9CSR2nHK+RLm+zyKHcgpNQGLmuC+nX7THcQ0wkuCSK2ahpABamEFfUtNHGoRwXTCp5L5QMRjNRvvXrwDWK2Bsj9Exac/26PeW8rgtmcHg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0Z3+BZW7qeYu6fkNvs9ctpV0Ec7v20cUG97YOzpH2OI=; b=KoCmKz7WOslCPCyRWRE0GaKmwvNcD+GZqOd5r9ELcYX6eoOU1p5b+fHtCpeqUscVzcQjPSFCrRaVCnkynasmv698t/IdJ3P3xOvKguNUBzc89JX5hXcZxlcyVP1TQqSgW6AM97hXr/TW7ut2N1XqWWn8gLqfzEGLVAgNcVdef1OMAoJarfLF/o8yumxYudJmspjBPupG8MrUXLKh0c8Ol/g30jJU1x/y5L1KZzFXWh8lsUPx9dAEQ2cjfZVwq9o3VsNTQkHxYjjtu0ISkOi4lrkh5AXOZkgQk/aVnKaf5cEKj8dNStdu8hZvwhiCXPSii/4MWNK2v5rnj1u6aGU4zA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=mellanox.com; dmarc=pass action=none header.from=mellanox.com; dkim=pass header.d=mellanox.com; arc=none Received: from DM4PR12MB5184.namprd12.prod.outlook.com (2603:10b6:5:397::18) by CH3PR12MB9195.namprd12.prod.outlook.com (2603:10b6:610:1a3::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6455.33; Tue, 13 Jun 2023 06:34:17 +0000 Received: from DM4PR12MB5184.namprd12.prod.outlook.com ([fe80::8ea1:de89:aab1:1d06]) by DM4PR12MB5184.namprd12.prod.outlook.com ([fe80::8ea1:de89:aab1:1d06%4]) with mapi id 15.20.6477.028; Tue, 13 Jun 2023 06:34:17 +0000 From: Bing Zhao To: Stephen Hemminger CC: "yipeng1.wang@intel.com" , "sameh.gobriel@intel.com" , "bruce.richardson@intel.com" , "pablo.de.lara.guarch@intel.com" , "dev@dpdk.org" , "NBU-Contact-Thomas Monjalon (EXTERNAL)" Subject: RE: [dpdk-dev] [RFC] hash: introduce resizable hash list Thread-Topic: [dpdk-dev] [RFC] hash: introduce resizable hash list Thread-Index: AQHWVYv/99yAaV7NKkyARKf9YVwa4q+N7ywAgADmCmA= Date: Tue, 13 Jun 2023 06:34:17 +0000 Message-ID: References: <1566975109-318949-1-git-send-email-bingz@mellanox.com> <20230612094420.3a3971c0@hermes.local> In-Reply-To: <20230612094420.3a3971c0@hermes.local> Accept-Language: en-US, zh-CN Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=mellanox.com; x-ms-exchange-messagesentrepresentingtype: 1 x-ms-publictraffictype: Email x-ms-traffictypediagnostic: DM4PR12MB5184:EE_|CH3PR12MB9195:EE_ x-ms-office365-filtering-correlation-id: 5b584c40-1c46-4968-02b0-08db6bd8366d x-ld-processed: 43083d15-7273-40c1-b7db-39efd9ccc17a,ExtAddr x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: I+stpYz0yaSlXA44/ePI94aKQieQ0WUVD0wRvlhW2zYTAwmjf5dkvhE+HHiOSUH9M5yizZmtsAih3et1Kfq3edP1LYebnDDCfYaQKL7tU62fAsoS+qnxdwj7RNc/6BhofR43ZPrHdJOoaKFjU5rnt4vpL/bPlTMj7nodRSpW+d5KHMwC0YAI/P/ZtwYqtbjvzttHRpdWNcYJaHxLAhtOkyYrZvNnj4h8nHi7hLVnB34//3mJXt4U6uRFpF1EM3ufb5HU4O1eb2x+ZCF98xZqeSULBEbM3atcFpOOtclW9doXAztvRS2BvGX/8Dnsn3u7umiPAjLv9PGE0hQrRoSXvya1M+VFQGmySUa33MDZ1cIsP6HcIuoSacabGrNQQuzR7hk+8cCe4wbMROm+SC6SeErwukR7MVSCylx83Q3SCb6DxVsfd/q6Lc3kwcdetDYan2OlGhZzrxYvUNikpP4garQPLYLUh92z4zaRQ2Mskwf3SKAeZ9HlEMsKHmSaoR/dVpLwOXxyGXy+7jSRN7VRzONgMT9PM3MNtVtgvdW/TtTHTeBD9i01PAtQ2Dhgu678cYmsh9roisN6otjQP73QMruRPtEF3cPXhyqm98kKGxqO/XK8v4oYt0Ys3bq1Vz61Fk1ThRZJqhTOAFxBzPT80aD1G5OA40FYPAL/cmHeh0w= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DM4PR12MB5184.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230028)(4636009)(376002)(39860400002)(366004)(136003)(346002)(396003)(451199021)(53546011)(9686003)(26005)(966005)(6506007)(83380400001)(66574015)(186003)(122000001)(66946007)(64756008)(66556008)(2906002)(66476007)(76116006)(66446008)(41300700001)(7696005)(71200400001)(8936002)(38070700005)(38100700002)(8676002)(5660300002)(54906003)(52536014)(33656002)(55016003)(478600001)(86362001)(316002)(6916009)(4326008); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?g0hTBElPUEq32ogVi9z0K3eSsJZKWBt/WjM7aLL4+VTKdUuY54Av176ngIup?= =?us-ascii?Q?eXALBdJkvRv4LDrOnV8M4v4odO7BLDCNzFjAqLiMYGNWlotgzA6giR9GXaW4?= =?us-ascii?Q?H/T9X3hfvhXZ/dGDJ93XnfSbQrE0RQa0PS5gJBj6CxpELt0V9tS4rS0nDJBu?= =?us-ascii?Q?9rxrsjZbohQTcM6CWn2y9lwE/Tz5HruZhHQQbGDKyNAYfpT1k2QNkcDwEhg0?= =?us-ascii?Q?BEOvZ8t/lfPol9drlvwZA4l+M20XQ4dg4AL0H8Vgd5K5V32UlzONxBMgWgJk?= =?us-ascii?Q?qLid06TEVWB6Fnxv11K/wuDsftELFnLash5c6sP28eOjDC8vdPFmA4wIY9wY?= =?us-ascii?Q?jzJl/CTWyJOZTbYmZrYbpDr+H6zEDUYJZfbMOBOgTmd38jYsWySLwaGCtQ83?= =?us-ascii?Q?M6WtcboliFnEoTh1qFXBvDTJ+zPPTp7ah4mZKTWsyOsFSs7Mav7PjP4BLvdx?= =?us-ascii?Q?LnqFba8IG2HWLpL1ZXOkZHU7DMQuNtjltnAE41qUZE3iC7oYI/vnR8D3T0q5?= =?us-ascii?Q?zkkmPqxP55ihMmZffIZQuuHILKGAhfXBqc5TpIqpd+LemQQ5j066B78k1wjj?= =?us-ascii?Q?UYPRrqy/q/H8dmXUSJsEDYYXP7uUtnYhDsK+gPf4oX0PdljHVRsbBYAVtTe8?= =?us-ascii?Q?vE1qmkR23OL213sj+BezvDWpo/QBQ3jIirum7oh5O67dgkT/PYpU6t5cXGUW?= =?us-ascii?Q?H7+QIpjzhEXhQMxRBleObKYFrkAOWvK9hfXz9T32kFSEdMA9113QdRMXQDrf?= =?us-ascii?Q?hHu6kdDxd5ocX8q+wTgMfgtrojfg+08YBqTWxGrMCZ0ESy2kRRaIeomfmS5Z?= =?us-ascii?Q?3RQKeoeAk8qrFzBJyAnd6aX4WiofKCJBz4VJ958WoZqIBqjFGQVGCTjJ0k4r?= =?us-ascii?Q?ElYdZvE/1aZ01wVeU95B5yEjDh/A7uolvkIwhxIkVDYQLAEevL80dgRU3H2S?= =?us-ascii?Q?TnOg1vUp9LT42dSB3tmDc03tZYLVmmN4DmoaoDSPvEQZmNXDya2b/wqDIuku?= =?us-ascii?Q?wHB5T9vuD4UTj2qP/iEjZgoijODZ09lNHfuUu+sxePUo5JUHgtcwXwcCsc+0?= =?us-ascii?Q?DQlr2S0o6XwyIgqCi/Y/AaB+mkiS/o7+9XfzX65Ma3QfUUkAJIdVEyu9Gs73?= =?us-ascii?Q?9KptSXAw+J8RGhouXFb5EbEcYNMEIG3GWXRYyw0YSnjke+2KaiA3JOcU+54u?= =?us-ascii?Q?+HYNVgFMKqpf1cn5IBWsYN82TVWOsNKmcEpr9PlsLH8uBSWp8SDu26Qv/kkv?= =?us-ascii?Q?xr6BJSWJGdkhpZ13l5MA6ZFYQS067NTO9vzeeBgdWC2MGLullq9C5669XV52?= =?us-ascii?Q?mdlc+Gv4ne9Y0uS1e7iHvhxyM+83QX1AHwKMhI7mLFmKlp6+bNXzWAvmFhu+?= =?us-ascii?Q?FL0ep8ncSUyteYsE4iYAQgwLm2rblpYBUlzy5rfN4bo1q2p5ZdtXG4afyobg?= =?us-ascii?Q?D16sy0ZPlvP77+PGchg491QHSFL9YxI3cyurt5dqNAnFavDm2Ri8pbmRWj+F?= =?us-ascii?Q?WEN98hzy02oSPxOGHmIuH0CGCXayUkAXJLOj4X8AX87bETVJ+s2q64CmXH4s?= =?us-ascii?Q?g8VlCQtleG2JNwfumvU=3D?= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: mellanox.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: DM4PR12MB5184.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 5b584c40-1c46-4968-02b0-08db6bd8366d X-MS-Exchange-CrossTenant-originalarrivaltime: 13 Jun 2023 06:34:17.2789 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: UDICYUAhMpwrzrEuDGYbQpfilpm+4Bud2v4NGo0vF+yzCeVbusdTcVPdp5LRHdYBtiPWw96xnYabmhuIJNubVQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH3PR12MB9195 X-Mailman-Approved-At: Wed, 14 Jun 2023 11:04:06 +0200 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Stephen, > -----Original Message----- > From: Stephen Hemminger > Sent: Tuesday, June 13, 2023 12:44 AM > To: Bing Zhao > Cc: yipeng1.wang@intel.com; sameh.gobriel@intel.com; > bruce.richardson@intel.com; pablo.de.lara.guarch@intel.com; dev@dpdk.org > Subject: Re: [dpdk-dev] [RFC] hash: introduce resizable hash list >=20 > On Wed, 28 Aug 2019 14:51:48 +0800 > Bing Zhao wrote: >=20 > > In the current hash library, there are already two hash tables and > > several hash calculation functions. > > > > FBK hash table is very lightweight, but the key length is limited to > > 4 bytes and the size of the table is fixed during startup. > > > > Cuckoo hash table is a quite great idea and nice implementation in the > > library, it is efficient and flexible. After some study of the code > > and information from internet, I find that the table extension is some > > pain point (correct me if anything wrong). It means that we need to > > allocate the tables in advance by defining a maximum size. > > So there is a chance that we waste some unused memory. Right now there > > is an extendable buckets table, but it seems that the number is also > > fixed and the same as the primary table. > > Take the flows information for example, we may support as many as > > several millions of flows. In some case, only several hundreds of > > flows will be created but in the other case, millions of flows are > > needed. So this means that we need to create the hash table to support > > millions of elements in advance during the startup time. If one key of > > flow information will consume 64B and 1M flows, so it will occupy more > > than one hundred MB memory (the table fields included). What is worse > > if there is only 2M + several elements, it needs to create a table of > > 4M (or more: depends on the hash collision rate) and it is some > > wasting of the memory. > > > > In order to handle this, an resizable hash list is introduced. > > The table could start with a small number of the elements and be > > allocated dynamically during the runtime. In the meanwhile, an > > on-demand list splitting mechanism is used in order not to make a > > significant performance degradation. Then there is no need to re-hash > > and relocate all the existing elements in the table when the table is > > extended. > > > > The brief design is as below: > > 1. The table is consisted of LIST header array and the LIST entries. > > In each entry, a pre-calculated hash signature is stored and is > > used to decide which header should it be linked to, by using > > "AND" with the mask to select the LSBs of the signature. > > 2. The header array size could start with a very small number, and a > > specified max number of each list is used to check if a table > > extension is needed. > > 3. When the max number on any of list is reached, the header array > > size will be doubled. Then each entries linked to this list will > > be split into two lists with the new mask (one more bit 1 in > > the mask, e.g. 0xfff -> 0x1fff). And a global shift number and > > local shift number of each list is used for the further checking. > > 4. When some other list is being accessed, a comparison for the shift > > numbers is used to check if the splitting of this list is needed. > > If so, then there will be two conditions: > > a. If the local shift number is only 1 less than global or > > non-zero, then this list is the one that needs to be split. > > b. If more than 1, then it means that the table is extended more > > than once. And If the local shift is zero, a mechanism is used > > to find the last unsplit list. > > And then the list will be split into 2/4/8... lists depends on > > the gap. All the entries then will linked to the proper header. > > So, each time when the hlist APIs are called, only one list will be > > traversed but not all the lists. And since there is parameter to set a > > max number of entries in a list. The traversal time is predictable and > > these will not cause a significant performance degradation. > > > > BR. Bing > > > > > > Bing Zhao (1): > > rte_hash: introduce hash list into hash lib > > > > lib/librte_hash/Makefile | 2 + > > lib/librte_hash/rte_hash_version.map | 15 + > > lib/librte_hash/rte_hlist.c | 687 > +++++++++++++++++++++++++++++++++++ > > lib/librte_hash/rte_hlist.h | 563 +++++++++++++++++++++++++++= + > > 4 files changed, 1267 insertions(+) > > create mode 100644 lib/librte_hash/rte_hlist.c create mode 100644 > > lib/librte_hash/rte_hlist.h > > >=20 > A resizeable hash table (with RCU?) would be good to have. > See old article about this https://lwn.net/Articles/573431/ But this patc= h got > review feedback and no follow up. > Marking it as Changes Requested, maybe someone can resuscitate it later. To my understanding, resizable hash may be needed by the application in the= real life. Sometimes the maximal entries may not be known during startup, = and it is some waste to allocate a huge amount of memory. (Probably there w= ould be some failure to insert, even) The extendable hash list would be the simplest way to go. While I am not a = researcher, maybe there is some other advanced way to go in the papers and = we need to translate it into code. Any ideas? BR. Bing