From: "Wang, Yipeng1"
To: Dharmik Thakkar, "Gobriel, Sameh", "Richardson, Bruce", Ray Kinsella, Neil Horman
Cc: dev@dpdk.org, nd@arm.com
Date: Wed, 21 Oct 2020 02:42:33 +0000
Subject: Re: [dpdk-dev] [PATCH v5 2/4] lib/hash: integrate RCU QSBR
References: <20201019163519.28180-1-dharmik.thakkar@arm.com> <20201020161301.7458-1-dharmik.thakkar@arm.com> <20201020161301.7458-3-dharmik.thakkar@arm.com>
In-Reply-To: <20201020161301.7458-3-dharmik.thakkar@arm.com>
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Dharmik Thakkar
> Sent: Tuesday, October 20, 2020 9:13 AM
> To: Wang, Yipeng1; Gobriel, Sameh; Richardson, Bruce; Ray Kinsella; Neil Horman
> Cc: dev@dpdk.org; nd@arm.com; Dharmik Thakkar
> Subject: [PATCH v5 2/4] lib/hash: integrate RCU QSBR
>
> Currently, users have to use external RCU mechanisms to free resources
> when using lock free hash algorithm.
>
> Integrate RCU QSBR process to make it easier for the applications to use lock
> free algorithm.
> Refer to RCU documentation to understand various aspects of integrating RCU
> library into other libraries.
>
> Suggested-by: Honnappa Nagarahalli
> Signed-off-by: Dharmik Thakkar
> Reviewed-by: Ruifeng Wang
> Acked-by: Ray Kinsella
> ---
>  doc/guides/prog_guide/hash_lib.rst |  11 +-
>  lib/librte_hash/meson.build        |   1 +
>  lib/librte_hash/rte_cuckoo_hash.c  | 302 ++++++++++++++++++++++-------
>  lib/librte_hash/rte_cuckoo_hash.h  |   8 +
>  lib/librte_hash/rte_hash.h         |  77 +++++++-
>  lib/librte_hash/version.map        |   2 +-
>  6 files changed, 325 insertions(+), 76 deletions(-)
>
> diff --git a/doc/guides/prog_guide/hash_lib.rst
> b/doc/guides/prog_guide/hash_lib.rst
> index d06c7de2ead1..63e183ed1f08 100644
> --- a/doc/guides/prog_guide/hash_lib.rst
> +++ b/doc/guides/prog_guide/hash_lib.rst
> @@ -102,6 +102,9 @@ For concurrent writes, and concurrent reads and
> writes the following flag values
>  * If the 'do not free on delete' (RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL)
> flag is set, the position of the entry in the hash table is not freed upon calling
> delete(). This flag is enabled
> by default when the lock free read/write concurrency flag is set. The
> application should free the position after all the readers have stopped
> referencing the position.
> Where required, the application can make use of RCU mechanisms to
> determine when the readers have stopped referencing the position.
> +  RCU QSBR process is integrated within the Hash library for safe freeing of
> the position. Application has certain responsibilities while using this feature.
> +  Please refer to resource reclamation framework of :ref:`RCU library
> ` for more details.

[Yipeng]: Maybe also add that rte_hash_rcu_qsbr_add() needs to be called to use the embedded RCU mechanism — just to give the user a pointer to which API to look at.

> +
>
>  Extendable Bucket Functionality support
>  ----------------------------------------
> @@ -109,8 +112,8 @@ An extra flag is used to enable this functionality (flag
> is not set by default).
> in the very unlikely case due to excessive hash collisions that a key has failed
> to be inserted, the hash table bucket is extended with a linked list to insert
> these failed keys. This feature is important for the workloads (e.g. telco
> workloads) that need to insert up to 100% of the hash table size and can't
> tolerate any key insertion failure (even if very few).
> -Please note that with the 'lock free read/write concurrency' flag enabled,
> users need to call 'rte_hash_free_key_with_position' API in order to free the
> empty buckets and
> -deleted keys, to maintain the 100% capacity guarantee.
> +Please note that with the 'lock free read/write concurrency' flag
> +enabled, users need to call 'rte_hash_free_key_with_position' API or
> configure integrated RCU QSBR (or use external RCU mechanisms) in order to
> free the empty buckets and deleted keys, to maintain the 100% capacity
> guarantee.
>
>  Implementation Details (non Extendable Bucket Case)
>  ---------------------------------------------------
> @@ -172,7 +175,7 @@ Example of deletion:
>  Similar to lookup, the key is searched in its primary and secondary buckets. If
> the key is found, the entry is marked as empty. If the hash table was
> configured with 'no free on delete' or 'lock free read/write concurrency', the
> position of the key is not freed. It is the responsibility of the user to free the
> position after
> -readers are not referencing the position anymore.
> +readers are not referencing the position anymore. User can configure
> +integrated RCU QSBR or use external RCU mechanisms to safely free the
> +position on delete
>
>
>  Implementation Details (with Extendable Bucket)
> @@ -286,6 +289,8 @@ The
> flow table operations on the application side are described below:
>  * Free flow: Free flow key position. If 'no free on delete' or 'lock-free
> read/write concurrency' flags are set,
>    wait till the readers are not referencing the position returned during
> add/delete flow and then free the position.
>    RCU mechanisms can be used to find out when the readers are not
> referencing the position anymore.
> +  RCU QSBR process is integrated within the Hash library for safe freeing of
> the position. Application has certain responsibilities while using this feature.
> +  Please refer to resource reclamation framework of :ref:`RCU library
> ` for more details.
>
>  * Lookup flow: Lookup for the flow key in the hash.
>    If the returned position is valid (flow lookup hit), use the returned position
> to access the flow entry in the flow table.
> diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
> index 6ab46ae9d768..0977a63fd279 100644
> --- a/lib/librte_hash/meson.build
> +++ b/lib/librte_hash/meson.build
> @@ -10,3 +10,4 @@ headers = files('rte_crc_arm64.h',
>
>  sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
>  deps += ['ring']
> +deps += ['rcu']
> diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> b/lib/librte_hash/rte_cuckoo_hash.c
> index aad0c965be5e..b9e4d82a0c14 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
> @@ -52,6 +52,11 @@ static struct rte_tailq_elem rte_hash_tailq = {
>  };
>  EAL_REGISTER_TAILQ(rte_hash_tailq)
>
> +struct __rte_hash_rcu_dq_entry {
> +	uint32_t key_idx;
> +	uint32_t ext_bkt_idx;	/**< Extended bkt index */
> +};
> +
>  struct rte_hash *
>  rte_hash_find_existing(const char *name)
>  {
> @@ -210,7 +215,10 @@ rte_hash_create(const struct rte_hash_parameters *params)
>
>  	if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF) {
>  		readwrite_concur_lf_support = 1;
> -		/* Enable not freeing internal memory/index on delete */
> +		/* Enable not freeing internal memory/index on delete.
> +		 * If internal RCU is enabled, freeing of internal memory/index
> +		 * is done on delete
> +		 */
>  		no_free_on_del = 1;
>  	}
>
> @@ -505,6 +513,10 @@ rte_hash_free(struct rte_hash *h)
>
>  	rte_mcfg_tailq_write_unlock();
>
> +	/* RCU clean up. */
> +	if (h->dq)
> +		rte_rcu_qsbr_dq_delete(h->dq);
> +
>  	if (h->use_local_cache)
>  		rte_free(h->local_free_slots);
>  	if (h->writer_takes_lock)
> @@ -607,11 +619,21 @@ void
>  rte_hash_reset(struct rte_hash *h)
>  {
>  	uint32_t tot_ring_cnt, i;
> +	unsigned int pending;
>
>  	if (h == NULL)
>  		return;
>
>  	__hash_rw_writer_lock(h);
> +
> +	/* RCU QSBR clean up. */
> +	if (h->dq) {
> +		/* Reclaim all the resources */
> +		rte_rcu_qsbr_dq_reclaim(h->dq, ~0, NULL, &pending, NULL);
> +		if (pending != 0)
> +			RTE_LOG(ERR, HASH, "RCU reclaim all resources failed\n");
> +	}
> +
>  	memset(h->buckets, 0, h->num_buckets * sizeof(struct rte_hash_bucket));
>  	memset(h->key_store, 0, h->key_entry_size * (h->entries + 1));
>  	*h->tbl_chng_cnt = 0;
> @@ -952,6 +974,37 @@ rte_hash_cuckoo_make_space_mw(const struct rte_hash *h,
>  	return -ENOSPC;
>  }
>
> +static inline uint32_t
> +alloc_slot(const struct rte_hash *h, struct lcore_cache *cached_free_slots)
> +{
> +	unsigned int n_slots;
> +	uint32_t slot_id;

[Yipeng]: Blank line after variable declaration.

> +	if (h->use_local_cache) {
> +		/* Try to get a free slot from the local cache */
> +		if (cached_free_slots->len == 0) {
> +			/* Need to get another burst of free slots from global ring */
> +			n_slots = rte_ring_mc_dequeue_burst_elem(h->free_slots,
> +					cached_free_slots->objs,
> +					sizeof(uint32_t),
> +					LCORE_CACHE_SIZE, NULL);
> +			if (n_slots == 0)
> +				return EMPTY_SLOT;
> +
> +			cached_free_slots->len += n_slots;
> +		}
> +
> +		/* Get a free slot from the local cache */
> +		cached_free_slots->len--;
> +		slot_id = cached_free_slots->objs[cached_free_slots->len];
> +	} else {
> +		if (rte_ring_sc_dequeue_elem(h->free_slots, &slot_id,
> +				sizeof(uint32_t)) != 0)
> +			return EMPTY_SLOT;
> +	}
> +
> +	return slot_id;
> +}
> +
>  static inline int32_t
>  __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
>  						hash_sig_t sig, void *data)
> @@ -963,7 +1016,6 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
>  	uint32_t ext_bkt_id = 0;
>  	uint32_t slot_id;
>  	int ret;
> -	unsigned n_slots;
>  	unsigned lcore_id;
>  	unsigned int i;
>  	struct lcore_cache *cached_free_slots = NULL;
> @@ -1001,28 +1053,20 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
>  	if (h->use_local_cache) {
>  		lcore_id = rte_lcore_id();
>  		cached_free_slots = &h->local_free_slots[lcore_id];
> -		/* Try to get a free slot from the local cache */
> -		if (cached_free_slots->len == 0) {
> -			/* Need to get another burst of free slots from global ring */
> -			n_slots = rte_ring_mc_dequeue_burst_elem(h->free_slots,
> -					cached_free_slots->objs,
> -					sizeof(uint32_t),
> -					LCORE_CACHE_SIZE, NULL);
> -			if (n_slots == 0) {
> -				return -ENOSPC;
> -			}
> -
> -			cached_free_slots->len += n_slots;
> +	}
> +	slot_id = alloc_slot(h, cached_free_slots);
> +	if (slot_id == EMPTY_SLOT) {
> +		if (h->dq) {
> +			__hash_rw_writer_lock(h);
> +			ret = rte_rcu_qsbr_dq_reclaim(h->dq,
> +					h->hash_rcu_cfg->max_reclaim_size,
> +					NULL, NULL, NULL);
> +			__hash_rw_writer_unlock(h);
> +			if (ret == 0)
> +				slot_id = alloc_slot(h, cached_free_slots);
>  		}
> -
> -		/* Get a free slot from the local cache */
> -		cached_free_slots->len--;
> -		slot_id = cached_free_slots->objs[cached_free_slots->len];
> -	} else {
> -		if (rte_ring_sc_dequeue_elem(h->free_slots, &slot_id,
> -				sizeof(uint32_t)) != 0) {
> +		if (slot_id == EMPTY_SLOT)
>  			return -ENOSPC;
> -		}
>  	}
>
>  	new_k = RTE_PTR_ADD(keys, slot_id * h->key_entry_size);
> @@ -1118,8 +1162,19 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
>  	if (rte_ring_sc_dequeue_elem(h->free_ext_bkts, &ext_bkt_id,
>  					sizeof(uint32_t)) != 0 ||
>  					ext_bkt_id == 0) {
> -		ret = -ENOSPC;
> -		goto failure;
> +		if (h->dq) {
> +			if (rte_rcu_qsbr_dq_reclaim(h->dq,
> +					h->hash_rcu_cfg->max_reclaim_size,
> +					NULL, NULL, NULL) == 0) {
> +				rte_ring_sc_dequeue_elem(h->free_ext_bkts,
> +						&ext_bkt_id,
> +						sizeof(uint32_t));
> +			}
> +		}
> +		if (ext_bkt_id == 0) {
> +			ret = -ENOSPC;
> +			goto failure;
> +		}
>  	}
>
>  	/* Use the first location of the new bucket */
> @@ -1395,12 +1450,12 @@ rte_hash_lookup_data(const struct rte_hash *h, const void *key, void **data)
>  	return __rte_hash_lookup_with_hash(h, key, rte_hash_hash(h, key), data);
>  }
>
> -static inline void
> -remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
> +static int
> +free_slot(const struct rte_hash *h, uint32_t slot_id)
>  {
>  	unsigned lcore_id, n_slots;
> -	struct lcore_cache *cached_free_slots;
> -
> +	struct lcore_cache *cached_free_slots = NULL;
> +	/* Return key indexes to free slot ring */
>  	if (h->use_local_cache) {
>  		lcore_id = rte_lcore_id();
>  		cached_free_slots = &h->local_free_slots[lcore_id];
> @@ -1411,18 +1466,127 @@ remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
>  				cached_free_slots->objs,
>  				sizeof(uint32_t),
>  				LCORE_CACHE_SIZE, NULL);
> -			ERR_IF_TRUE((n_slots == 0),
> -				"%s: could not enqueue free slots in global ring\n",
> -				__func__);
> +			RETURN_IF_TRUE((n_slots == 0), -EFAULT);
>  			cached_free_slots->len -= n_slots;
>  		}
> -		/* Put index of new free slot in cache. */
> -		cached_free_slots->objs[cached_free_slots->len] =
> -				bkt->key_idx[i];
> -		cached_free_slots->len++;
> +	}
> +
> +	enqueue_slot_back(h, cached_free_slots, slot_id);
> +	return 0;
> +}
> +
> +static void
> +__hash_rcu_qsbr_free_resource(void *p, void *e, unsigned int n)
> +{
> +	void *key_data = NULL;
> +	int ret;
> +	struct rte_hash_key *keys, *k;
> +	struct rte_hash *h = (struct rte_hash *)p;
> +	struct __rte_hash_rcu_dq_entry rcu_dq_entry =
> +			*((struct __rte_hash_rcu_dq_entry *)e);
> +
> +	RTE_SET_USED(n);
> +	keys = h->key_store;
> +
> +	k = (struct rte_hash_key *) ((char *)keys +
> +			rcu_dq_entry.key_idx * h->key_entry_size);
> +	key_data = k->pdata;
> +	if (h->hash_rcu_cfg->free_key_data_func)
> +		h->hash_rcu_cfg->free_key_data_func(h->hash_rcu_cfg->key_data_ptr,
> +				key_data);
> +
> +	if (h->ext_table_support && rcu_dq_entry.ext_bkt_idx != EMPTY_SLOT)
> +		/* Recycle empty ext bkt to free list. */
> +		rte_ring_sp_enqueue_elem(h->free_ext_bkts,
> +			&rcu_dq_entry.ext_bkt_idx, sizeof(uint32_t));
> +
> +	/* Return key indexes to free slot ring */
> +	ret = free_slot(h, rcu_dq_entry.key_idx);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, HASH,
> +			"%s: could not enqueue free slots in global ring\n",
> +			__func__);
> +	}
> +}
> +
> +int
> +rte_hash_rcu_qsbr_add(struct rte_hash *h,
> +		struct rte_hash_rcu_config *cfg)
> +{
> +	struct rte_rcu_qsbr_dq_parameters params = {0};
> +	char rcu_dq_name[RTE_RCU_QSBR_DQ_NAMESIZE];
> +	struct rte_hash_rcu_config *hash_rcu_cfg = NULL;
> +
> +	const uint32_t total_entries = h->use_local_cache ?
> +		h->entries + (RTE_MAX_LCORE - 1) * (LCORE_CACHE_SIZE - 1) + 1
> +		: h->entries + 1;
> +
> +	if ((h == NULL) || cfg == NULL || cfg->v == NULL) {
> +		rte_errno = EINVAL;
> +		return 1;
> +	}
> +
> +	if (h->hash_rcu_cfg) {
> +		rte_errno = EEXIST;
> +		return 1;
> +	}
> +
> +	hash_rcu_cfg = rte_zmalloc(NULL, sizeof(struct rte_hash_rcu_config), 0);
> +	if (hash_rcu_cfg == NULL) {
> +		RTE_LOG(ERR, HASH, "memory allocation failed\n");
> +		return 1;
> +	}
> +
> +	if (cfg->mode == RTE_HASH_QSBR_MODE_SYNC) {
> +		/* No other things to do. */
> +	} else if (cfg->mode == RTE_HASH_QSBR_MODE_DQ) {
> +		/* Init QSBR defer queue. */
> +		snprintf(rcu_dq_name, sizeof(rcu_dq_name),
> +				"HASH_RCU_%s", h->name);
> +		params.name = rcu_dq_name;
> +		params.size = cfg->dq_size;
> +		if (params.size == 0)
> +			params.size = total_entries;
> +		params.trigger_reclaim_limit = cfg->trigger_reclaim_limit;
> +		if (params.max_reclaim_size == 0)
> +			params.max_reclaim_size = RTE_HASH_RCU_DQ_RECLAIM_MAX;
> +		params.esize = sizeof(struct __rte_hash_rcu_dq_entry);
> +		params.free_fn = __hash_rcu_qsbr_free_resource;
> +		params.p = h;
> +		params.v = cfg->v;
> +		h->dq = rte_rcu_qsbr_dq_create(&params);
> +		if (h->dq == NULL) {
> +			rte_free(hash_rcu_cfg);
> +			RTE_LOG(ERR, HASH, "HASH defer queue creation failed\n");
> +			return 1;
> +		}
>  	} else {
> -		rte_ring_sp_enqueue_elem(h->free_slots,
> -				&bkt->key_idx[i], sizeof(uint32_t));
> +		rte_free(hash_rcu_cfg);
> +		rte_errno = EINVAL;
> +		return 1;
> +	}
> +
> +	hash_rcu_cfg->v = cfg->v;
> +	hash_rcu_cfg->mode = cfg->mode;
> +	hash_rcu_cfg->dq_size = params.size;
> +	hash_rcu_cfg->trigger_reclaim_limit = params.trigger_reclaim_limit;
> +	hash_rcu_cfg->max_reclaim_size = params.max_reclaim_size;
> +	hash_rcu_cfg->free_key_data_func = cfg->free_key_data_func;
> +	hash_rcu_cfg->key_data_ptr = cfg->key_data_ptr;
> +
> +	h->hash_rcu_cfg = hash_rcu_cfg;
> +
> +	return 0;
> +}
> +
> +static inline void
> +remove_entry(const struct rte_hash *h, struct rte_hash_bucket *bkt, unsigned i)
> +{
> +	int ret = free_slot(h, bkt->key_idx[i]);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, HASH,
> +			"%s: could not enqueue free slots in global ring\n",
> +			__func__);
>  	}
>  }
>
> @@ -1521,6 +1685,8 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
>  	int pos;
>  	int32_t ret, i;
>  	uint16_t short_sig;
> +	uint32_t index = EMPTY_SLOT;
> +	struct __rte_hash_rcu_dq_entry rcu_dq_entry;
>
>  	short_sig = get_short_sig(sig);
>  	prim_bucket_idx = get_prim_bucket_index(h, sig);
> @@ -1555,10 +1721,9 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
>
>  	/* Search last bucket to see if empty to be recycled */
>  return_bkt:
> -	if (!last_bkt) {
> -		__hash_rw_writer_unlock(h);
> -		return ret;
> -	}
> +	if (!last_bkt)
> +		goto return_key;
> +
>  	while (last_bkt->next) {
>  		prev_bkt = last_bkt;
>  		last_bkt = last_bkt->next;
> @@ -1571,11 +1736,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
>  	/* found empty bucket and recycle */
>  	if (i == RTE_HASH_BUCKET_ENTRIES) {
>  		prev_bkt->next = NULL;
> -		uint32_t index = last_bkt - h->buckets_ext + 1;
> +		index = last_bkt - h->buckets_ext + 1;
>  		/* Recycle the empty bkt if
>  		 * no_free_on_del is disabled.
>  		 */
> -		if (h->no_free_on_del)
> +		if (h->no_free_on_del) {
>  			/* Store index of an empty ext bkt to be recycled
>  			 * on calling rte_hash_del_xxx APIs.
>  			 * When lock free read-write concurrency is enabled,
> @@ -1583,12 +1748,34 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
>  			 * immediately (as readers might be using it still).
>  			 * Hence freeing of the ext bkt is piggy-backed to
>  			 * freeing of the key index.
> +			 * If using external RCU, store this index in an array.
>  			 */
> -			h->ext_bkt_to_free[ret] = index;
> +			if (h->hash_rcu_cfg == NULL)
> +				h->ext_bkt_to_free[ret] = index;

[Yipeng]: If using embedded qsbr (not NULL), how did you recycle the ext bkt?

> +		} else
>  			rte_ring_sp_enqueue_elem(h->free_ext_bkts,
>  						&index,
>  						sizeof(uint32_t));
>  	}
> +
> +return_key:
> +	/* Using internal RCU QSBR */
> +	if (h->hash_rcu_cfg) {
> +		/* Key index where key is stored, adding the first dummy index */
> +		rcu_dq_entry.key_idx = ret + 1;
> +		rcu_dq_entry.ext_bkt_idx = index;
> +		if (h->dq == NULL) {
> +			/* Wait for quiescent state change if using
> +			 * RTE_HASH_QSBR_MODE_SYNC
> +			 */
> +			rte_rcu_qsbr_synchronize(h->hash_rcu_cfg->v,
> +					RTE_QSBR_THRID_INVALID);
> +			__hash_rcu_qsbr_free_resource((void *)((uintptr_t)h),
> +					&rcu_dq_entry, 1);
> +		} else if (h->dq)
> +			/* Push into QSBR FIFO if using RTE_HASH_QSBR_MODE_DQ */
> +			if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0)
> +				RTE_LOG(ERR, HASH, "Failed to push QSBR FIFO\n");
> +	}
>  	__hash_rw_writer_unlock(h);
>  	return ret;
>  }
> @@ -1637,8 +1824,6 @@ rte_hash_free_key_with_position(const struct rte_hash *h,
>
>  	RETURN_IF_TRUE(((h == NULL) || (key_idx == EMPTY_SLOT)), -EINVAL);
>
> -	unsigned int lcore_id, n_slots;
> -	struct lcore_cache *cached_free_slots;
>  	const uint32_t total_entries = h->use_local_cache ?
>  		h->entries + (RTE_MAX_LCORE - 1) * (LCORE_CACHE_SIZE - 1) + 1
>  		: h->entries + 1;
> @@ -1656,28 +1841,9 @@ rte_hash_free_key_with_position(const struct rte_hash *h,
>  		}
>  	}
>
> -	if (h->use_local_cache) {
> -		lcore_id = rte_lcore_id();
> -		cached_free_slots = &h->local_free_slots[lcore_id];
> -		/* Cache full, need to free it. */
> -		if (cached_free_slots->len == LCORE_CACHE_SIZE) {
> -			/* Need to enqueue the free slots in global ring.
> -			 */
> -			n_slots = rte_ring_mp_enqueue_burst_elem(h->free_slots,
> -					cached_free_slots->objs,
> -					sizeof(uint32_t),
> -					LCORE_CACHE_SIZE, NULL);
> -			RETURN_IF_TRUE((n_slots == 0), -EFAULT);
> -			cached_free_slots->len -= n_slots;
> -		}
> -		/* Put index of new free slot in cache. */
> -		cached_free_slots->objs[cached_free_slots->len] = key_idx;
> -		cached_free_slots->len++;
> -	} else {
> -		rte_ring_sp_enqueue_elem(h->free_slots, &key_idx,
> -				sizeof(uint32_t));
> -	}
> +	/* Enqueue slot to cache/ring of free slots. */
> +	return free_slot(h, key_idx);
>
> -	return 0;
>  }
>
>  static inline void
> diff --git a/lib/librte_hash/rte_cuckoo_hash.h
> b/lib/librte_hash/rte_cuckoo_hash.h
> index 345de6bf9cfd..85be49d3bbe7 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.h
> +++ b/lib/librte_hash/rte_cuckoo_hash.h
> @@ -168,6 +168,11 @@ struct rte_hash {
>  	struct lcore_cache *local_free_slots;
>  	/**< Local cache per lcore, storing some indexes of the free slots */
>
> +	/* RCU config */
> +	struct rte_hash_rcu_config *hash_rcu_cfg;
> +	/**< HASH RCU QSBR configuration structure */
> +	struct rte_rcu_qsbr_dq *dq;	/**< RCU QSBR defer queue. */
> +
>  	/* Fields used in lookup */
>
>  	uint32_t key_len __rte_cache_aligned;
> @@ -230,4 +235,7 @@ struct queue_node {
>  	int prev_slot;	/* Parent(slot) in search path */
>  };
>
> +/** @internal Default RCU defer queue entries to reclaim in one go. */
> +#define RTE_HASH_RCU_DQ_RECLAIM_MAX	16
> +
>  #endif
> diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
> index bff40251bc98..3d28f177f14a 100644
> --- a/lib/librte_hash/rte_hash.h
> +++ b/lib/librte_hash/rte_hash.h
> @@ -15,6 +15,7 @@
>  #include
>
>  #include
> +#include
>
>  #ifdef __cplusplus
>  extern "C" {
> @@ -45,7 +46,8 @@ extern "C" {
>  /** Flag to disable freeing of key index on hash delete.
>  * Refer to rte_hash_del_xxx APIs for more details.
>  * This is enabled by default when
>  * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF
> - * is enabled.
> + * is enabled. However, if internal RCU is enabled, freeing of internal
> + * memory/index is done on delete
>  */
>  #define RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL 0x10
>
> @@ -67,6 +69,13 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,
>  /** Type of function used to compare the hash key. */
>  typedef int (*rte_hash_cmp_eq_t)(const void *key1, const void *key2, size_t key_len);
>
> +/**
> + * Type of function used to free data stored in the key.
> + * Required when using internal RCU to allow application to free key-data once
> + * the key is returned to the ring of free key-slots.
> + */
> +typedef void (*rte_hash_free_key_data)(void *p, void *key_data);
> +
>  /**
>  * Parameters used when creating the hash table.
>  */
> @@ -81,6 +90,39 @@ struct rte_hash_parameters {
>  	uint8_t extra_flag;	/**< Indicate if additional parameters are present. */
>  };
>
> +/** RCU reclamation modes */
> +enum rte_hash_qsbr_mode {
> +	/** Create defer queue for reclaim. */
> +	RTE_HASH_QSBR_MODE_DQ = 0,
> +	/** Use blocking mode reclaim. No defer queue created. */
> +	RTE_HASH_QSBR_MODE_SYNC
> +};
> +
> +/** HASH RCU QSBR configuration structure. */
> +struct rte_hash_rcu_config {
> +	struct rte_rcu_qsbr *v;	/**< RCU QSBR variable. */
> +	enum rte_hash_qsbr_mode mode;
> +	/**< Mode of RCU QSBR. RTE_HASH_QSBR_MODE_xxx
> +	 * '0' for default: create defer queue for reclaim.
> +	 */
> +	uint32_t dq_size;
> +	/**< RCU defer queue size.
> +	 * default: total hash table entries.
> +	 */
> +	uint32_t trigger_reclaim_limit;	/**< Threshold to trigger auto reclaim. */
> +	uint32_t max_reclaim_size;
> +	/**< Max entries to reclaim in one go.
> +	 * default: RTE_HASH_RCU_DQ_RECLAIM_MAX.
> +	 */
> +	void *key_data_ptr;
> +	/**< Pointer passed to the free function. Typically, this is the
> +	 * pointer to the data structure to which the resource to free
> +	 * (key-data) belongs. This can be NULL.
> +	 */
> +	rte_hash_free_key_data free_key_data_func;
> +	/**< Function to call to free the resource (key-data). */
> +};
> +
>  /** @internal A hash table structure. */
>  struct rte_hash;
>
> @@ -287,7 +329,8 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key, hash_sig_t
>  * Thread safety can be enabled by setting flag during
>  * table creation.
>  * If RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL or
> - * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
> + * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled and
> + * internal RCU is NOT enabled,
>  * the key index returned by rte_hash_add_key_xxx APIs will not be
>  * freed by this API. rte_hash_free_key_with_position API must be called
>  * additionally to free the index associated with the key.
> @@ -316,7 +359,8 @@ rte_hash_del_key(const struct rte_hash *h, const void *key);
>  * Thread safety can be enabled by setting flag during
>  * table creation.
>  * If RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL or
> - * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
> + * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled and
> + * internal RCU is NOT enabled,
>  * the key index returned by rte_hash_add_key_xxx APIs will not be
>  * freed by this API. rte_hash_free_key_with_position API must be called
>  * additionally to free the index associated with the key.
> @@ -370,7 +414,8 @@ rte_hash_get_key_with_position(const struct rte_hash *h, const int32_t position,
>  * only be called from one thread by default. Thread safety
>  * can be enabled by setting flag during table creation.
>  * If RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL or
> - * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
> + * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled and
> + * internal RCU is NOT enabled,
>  * the key index returned by rte_hash_del_key_xxx APIs must be freed
>  * using this API. This API should be called after all the readers
>  * have stopped referencing the entry corresponding to this key.
> @@ -625,6 +670,30 @@ rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
>  */
>  int32_t
>  rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32_t *next);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Associate RCU QSBR variable with an Hash object.

[Yipeng]: "an Hash object" should be "a Hash object".

> + * This API should be called to enable the integrated RCU QSBR support and
> + * should be called immediately after creating the Hash object.
> + *
> + * @param h
> + *   the hash object to add RCU QSBR
> + * @param cfg
> + *   RCU QSBR configuration
> + * @return
> + *   On success - 0
> + *   On error - 1 with error code set in rte_errno.
> + *   Possible rte_errno codes are:
> + *   - EINVAL - invalid pointer
> + *   - EEXIST - already added QSBR
> + *   - ENOMEM - memory allocation failure
> + */
> +__rte_experimental
> +int rte_hash_rcu_qsbr_add(struct rte_hash *h,
> +		struct rte_hash_rcu_config *cfg);
> #ifdef __cplusplus
> }
> #endif
> diff --git a/lib/librte_hash/version.map b/lib/librte_hash/version.map
> index c0db81014ff9..c6d73080f478 100644
> --- a/lib/librte_hash/version.map
> +++ b/lib/librte_hash/version.map
> @@ -36,5 +36,5 @@ EXPERIMENTAL {
>  	rte_hash_lookup_with_hash_bulk;
>  	rte_hash_lookup_with_hash_bulk_data;
>  	rte_hash_max_key_id;
> -
> +	rte_hash_rcu_qsbr_add;
>  };
> --
> 2.17.1

[Yipeng]: Hi Dharmik, thanks for the work! It generally looks good. Just some minor issues to address and one question about the ext table, inline above.
Thanks!
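[Editor's note] For readers following the thread, below is a minimal usage sketch of the API this patch proposes, assembled from the names visible in the diff above (`struct rte_hash_rcu_config`, `RTE_HASH_QSBR_MODE_DQ`, `rte_hash_rcu_qsbr_add()`, the `rte_hash_free_key_data` callback). It is illustrative rather than buildable in isolation: it depends on a DPDK EAL environment, the table sizing and QSBR setup (`rte_rcu_qsbr_get_memsize()`/`rte_rcu_qsbr_init()`) are simplified, and error handling is abbreviated.

```c
/* Sketch: attach an RCU QSBR variable to a hash table so deleted keys are
 * reclaimed through the integrated defer queue. Requires DPDK (EAL
 * initialized); names follow the patch above. */
#include <rte_hash.h>
#include <rte_rcu_qsbr.h>
#include <rte_malloc.h>

/* Callback invoked once a deleted key's data is safe to free.
 * 'p' is the key_data_ptr from the config (an allocator context,
 * mempool, etc.); here we assume rte_free()-able data. */
static void
free_key_data(void *p, void *key_data)
{
	(void)p;
	rte_free(key_data);
}

static struct rte_hash *
create_hash_with_rcu(uint32_t max_reader_threads)
{
	struct rte_hash_parameters hparams = {
		.name = "example_hash",               /* hypothetical name */
		.entries = 1024,
		.key_len = sizeof(uint32_t),
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF,
	};
	struct rte_hash *h = rte_hash_create(&hparams);
	if (h == NULL)
		return NULL;

	/* Allocate and initialize the QSBR variable shared with readers. */
	size_t sz = rte_rcu_qsbr_get_memsize(max_reader_threads);
	struct rte_rcu_qsbr *qv = rte_zmalloc(NULL, sz, RTE_CACHE_LINE_SIZE);
	if (qv == NULL || rte_rcu_qsbr_init(qv, max_reader_threads) != 0) {
		rte_hash_free(h);
		return NULL;
	}

	/* Enable integrated reclamation in defer-queue mode; leaving the
	 * size/threshold fields zero picks the defaults described above. */
	struct rte_hash_rcu_config rcu_cfg = {
		.v = qv,
		.mode = RTE_HASH_QSBR_MODE_DQ,
		.key_data_ptr = NULL,
		.free_key_data_func = free_key_data,
	};
	if (rte_hash_rcu_qsbr_add(h, &rcu_cfg) != 0) {
		/* rte_errno is EINVAL, EEXIST or ENOMEM per the API doc. */
		rte_hash_free(h);
		return NULL;
	}
	return h;
}
```

Reader threads would then register on the same QSBR variable (`rte_rcu_qsbr_thread_register()`/`rte_rcu_qsbr_thread_online()`) and report quiescent states with `rte_rcu_qsbr_quiescent()`; with that in place, `rte_hash_del_key()` alone suffices and `rte_hash_free_key_with_position()` is no longer needed, which is exactly the simplification the commit message claims.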