From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id AC5A9A2F18 for ; Thu, 3 Oct 2019 08:29:45 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id C88591BFCD; Thu, 3 Oct 2019 08:29:44 +0200 (CEST) Received: from EUR02-AM5-obe.outbound.protection.outlook.com (mail-eopbgr00055.outbound.protection.outlook.com [40.107.0.55]) by dpdk.org (Postfix) with ESMTP id 0630B1BFCC for ; Thu, 3 Oct 2019 08:29:43 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gWw3KS58zS1gNO6fzD0TU2t2R9KZQuup8bVveucDqq0=; b=kYMqCzWcOM+r30FXsFqNdOC4r8iv4l6hQKQ3TCIEPzRfgyAKIk8aLsnSHlEF6gkrmkkGbQ2Fmg7GzJFCoigdCiCplIwzqOkboqSedk7fnVhML7S0sG+vtbXwNm4QCWc6oU5gDBfolSX5LCY1/EmgEk2UreLau2QHNFj7nZ+fEWo= Received: from DB6PR0802CA0028.eurprd08.prod.outlook.com (2603:10a6:4:a3::14) by DB8PR08MB4137.eurprd08.prod.outlook.com (2603:10a6:10:a5::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2305.20; Thu, 3 Oct 2019 06:29:41 +0000 Received: from DB5EUR03FT009.eop-EUR03.prod.protection.outlook.com (2a01:111:f400:7e0a::206) by DB6PR0802CA0028.outlook.office365.com (2603:10a6:4:a3::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.2305.20 via Frontend Transport; Thu, 3 Oct 2019 06:29:41 +0000 Authentication-Results: spf=temperror (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dpdk.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dpdk.org; dmarc=none action=none header.from=arm.com; Received-SPF: TempError (protection.outlook.com: error in processing during lookup of arm.com: DNS Timeout) Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5EUR03FT009.mail.protection.outlook.com (10.152.20.117) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.2305.15 via Frontend Transport; Thu, 3 Oct 2019 06:29:40 +0000 Received: ("Tessian outbound 927f2cdd66cc:v33"); Thu, 03 Oct 2019 06:29:36 +0000 X-CR-MTA-TID: 64aa7808 Received: from 6da6d61cc5a2.2 (ip-172-16-0-2.eu-west-1.compute.internal [104.47.14.58]) by 64aa7808-outbound-1.mta.getcheckrecipient.com id 93D7543D-E1DE-4FF1-A69A-5A3D147076BD.1; Thu, 03 Oct 2019 06:29:31 +0000 Received: from EUR04-VI1-obe.outbound.protection.outlook.com (mail-vi1eur04lp2058.outbound.protection.outlook.com [104.47.14.58]) by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 6da6d61cc5a2.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 03 Oct 2019 06:29:31 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WK+v/PqJo8xhacLf8FuSPNrjgsxmfrCGcSuoNqy6ZzBbPmtWXSr2Bg2dfp21FyMu0q0ngZShxOIL9tkhTF8gu8QRapUUmzVyj/jKdQqLNwHVwswybzVwgoEBCg/oKPkhqptthYHzC8gKUBZZjq/29Y5BGKtuPm/ULGJvqWOviuhC0cmFmGqW3mU/TgpRVUAF4Z0vRWYLwr0EnetYsD0jjIQG32d1WE/8Dn/uDlq0V+XLPUmXwayftm7h4qsmqYlCclDw4Azi7fUaKuS/3TdR/JGAZ7yJSlsaNY5To+3xUtLElFrauxGEHJ/KmZOlRWs4jVbwI+0olte08WwGWaVq7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gWw3KS58zS1gNO6fzD0TU2t2R9KZQuup8bVveucDqq0=; b=cDsLSlAU+HZc6Wr+jwOnvuPj1iIzCfNeiyeOBtszughjtQdlJ0sFBPWOCdf1QBjRQ+PkloMo1lPBjcxvgZ3S75gsLQgSM5IxKmTbohJqbPhbWbkeTe5k6KIOmNJQNn0WF2RxHA7QxGr1uWETHRWWXazI0KoWhWnkUNIasiwcSi3zQWb+1B6Fbcg9+qQZRw849viaOdIoiz1KaAz9TEBqVFLEGKFRQMWgTydAxA+g0OUkVa/gLgmiyRMXQpFi3uHcT6KQKs2RkJrCYvOkt2xEMrp6vY3xI+scIvaV9SHDO63Dlk3zAcUwEOoOQOiz+vnLVUBfqVOdAqRbYoN/3SH4hg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gWw3KS58zS1gNO6fzD0TU2t2R9KZQuup8bVveucDqq0=; b=kYMqCzWcOM+r30FXsFqNdOC4r8iv4l6hQKQ3TCIEPzRfgyAKIk8aLsnSHlEF6gkrmkkGbQ2Fmg7GzJFCoigdCiCplIwzqOkboqSedk7fnVhML7S0sG+vtbXwNm4QCWc6oU5gDBfolSX5LCY1/EmgEk2UreLau2QHNFj7nZ+fEWo= Received: from VE1PR08MB5149.eurprd08.prod.outlook.com (20.179.30.27) by VE1PR08MB5279.eurprd08.prod.outlook.com (20.179.29.146) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2305.15; Thu, 3 Oct 2019 06:29:29 +0000 Received: from VE1PR08MB5149.eurprd08.prod.outlook.com ([fe80::8c82:8d9c:c78d:22a6]) by VE1PR08MB5149.eurprd08.prod.outlook.com ([fe80::8c82:8d9c:c78d:22a6%7]) with mapi id 15.20.2305.023; Thu, 3 Oct 2019 06:29:29 +0000 From: Honnappa Nagarahalli To: "Ananyev, Konstantin" , "stephen@networkplumber.org" , "paulmck@linux.ibm.com" CC: "Wang, Yipeng1" , "Medvedkin, Vladimir" , "Ruifeng Wang (Arm Technology China)" , Dharmik Thakkar , Honnappa Nagarahalli , "dev@dpdk.org" , nd , nd Thread-Topic: [PATCH v3 2/3] lib/rcu: add resource reclamation APIs Thread-Index: AQHVeUhlHoJ4LCRP6EyrG0pYBVGBKKdIZEYQ Date: Thu, 3 Oct 2019 06:29:28 +0000 Message-ID: References: <20190906094534.36060-1-ruifeng.wang@arm.com> <20191001062917.35578-1-honnappa.nagarahalli@arm.com> <20191001062917.35578-3-honnappa.nagarahalli@arm.com> <2601191342CEEE43887BDE71AB977258019196FF8E@irsmsx105.ger.corp.intel.com> In-Reply-To: <2601191342CEEE43887BDE71AB977258019196FF8E@irsmsx105.ger.corp.intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ts-tracking-id: 09b724cd-79b5-41fe-8e7f-8b333204cab2.0 x-checkrecipientchecked: true Authentication-Results-Original: spf=none (sender IP is ) smtp.mailfrom=Honnappa.Nagarahalli@arm.com; x-originating-ip: [217.140.111.135] x-ms-publictraffictype: Email X-MS-Office365-Filtering-Correlation-Id: 0473f43b-0992-468a-e359-08d747cb126c X-MS-Office365-Filtering-HT: Tenant X-MS-TrafficTypeDiagnostic: VE1PR08MB5279:|VE1PR08MB5279:|DB8PR08MB4137: x-ms-exchange-transport-forked: True X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true x-ms-oob-tlc-oobclassifiers: OLM:7219;OLM:7219; x-forefront-prvs: 01792087B6 X-Forefront-Antispam-Report-Untrusted: SFV:NSPM; SFS:(10009020)(4636009)(346002)(396003)(39860400002)(366004)(136003)(376002)(52314003)(189003)(199004)(86362001)(30864003)(256004)(2501003)(6246003)(71200400001)(71190400001)(2201001)(5660300002)(55016002)(14444005)(4326008)(6436002)(9686003)(52536014)(229853002)(66446008)(8676002)(64756008)(81156014)(476003)(186003)(81166006)(25786009)(76116006)(76176011)(66946007)(316002)(6506007)(478600001)(66556008)(110136005)(11346002)(446003)(54906003)(6116002)(3846002)(33656002)(8936002)(486006)(66066001)(99286004)(7736002)(74316002)(7696005)(14454004)(305945005)(26005)(102836004)(2906002)(66476007)(579004); DIR:OUT; SFP:1101; SCL:1; SRVR:VE1PR08MB5279; H:VE1PR08MB5149.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; MX:1; A:1; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: b44Q1njXZSoT2o+GMNeXLMLVVEzvguuxX5OLbajcMBA4+Iy3lrElns9zEOLF6M3DS/22UEn/Wupk3vGPBoxXQd081yZ9J7+q2mHIoetskajkrZ48sson23Acs6ebZthC66Dpt9nJ93IWIxSrJ6oLsRcg1XbUVvhaJlggqLVcKgyKM1aLSpIB9wNAPIhqjekMTtbjElM57ixU0mOhdM2yR650a9X+avaAqJjFC2ZCiQntcHBaarU/+vz7qklO9z9o9hN5TJTHUoF2Xy04tA7YZx2VT3YNBEsuzRuR4AEoHQSguRJKbXCJ5I3x7IyM95FgpAkV6FlJuId3GtkYqkiEdpdL2SpUCvj8JqArPKqHWHo3PDMuON4ggE8V6YZ9S5irTNYIfTTJ4f64vviR++kRDZ1KGaRJxQ7e+862v2LcsOc= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5279 Original-Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=Honnappa.Nagarahalli@arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT009.eop-EUR03.prod.protection.outlook.com X-Forefront-Antispam-Report: CIP:63.35.35.123; IPV:CAL; SCL:-1; CTRY:IE; EFV:NLI; SFV:NSPM; SFS:(10009020)(4636009)(376002)(396003)(346002)(39860400002)(136003)(189003)(52314003)(199004)(3846002)(76130400001)(6116002)(11346002)(99286004)(8676002)(229853002)(81166006)(81156014)(7736002)(25786009)(305945005)(74316002)(4326008)(8746002)(8936002)(9686003)(55016002)(6246003)(86362001)(23726003)(14444005)(33656002)(50466002)(46406003)(2906002)(2501003)(97756001)(478600001)(26826003)(14454004)(316002)(446003)(76176011)(22756006)(63350400001)(7696005)(102836004)(356004)(126002)(486006)(6506007)(70206006)(476003)(110136005)(70586007)(52536014)(54906003)(5660300002)(186003)(66066001)(47776003)(30864003)(26005)(2201001)(336012); DIR:OUT; SFP:1101; SCL:1; SRVR:DB8PR08MB4137; H:64aa7808-outbound-1.mta.getcheckrecipient.com; FPR:; SPF:TempError; LANG:en; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; MX:1; A:1; X-MS-Office365-Filtering-Correlation-Id-Prvs: 8051a523-4366-460f-a0e9-08d747cb0b5e NoDisclaimer: True X-Forefront-PRVS: 01792087B6 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: FTKxZ+TcivnDJ1WLeKViK96Sdss5G+Vk4PBfgg2joN7MSNBU2kDfyM7Uj0F+Na5lyxB70ZjECv8tgC5lBHeEFKus6y+WRGzxiPGA/G27dB8VNanaKxG6p53E/eugTkZYn0E0Dmnn5QH9Kg3gnOLc41rHC++wEbSzB6pfvOXuarUwOp9eGREIZ9DyJFLGg/SQgxrEKXjzvH1GHe/F394QY4ZiaCFjCoGvW3zlN1J4CSNV/pX+RIMNqTkukDrsbKGZiKo1RMzbemAYUGkyPlRiYGlVxquz0Xu8pK/3w+u29/x67WsZyuaFoesvsAcAOWwTUb3+6X6N0QR16Ly6uhWXbmlcVYbG1dHyQGDi4wrkVuxVVCUENGCtD1NlrcyIM/3CAvjJx7VW2KNJUYmcih3+Wu2HfNolPCQqHCgZ3OYUz9E= X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Oct 2019 06:29:40.8553 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 0473f43b-0992-468a-e359-08d747cb126c X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR08MB4137 Subject: Re: [dpdk-dev] [PATCH v3 2/3] lib/rcu: add resource reclamation APIs X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" >=20 > Hi Honnappa, Thanks Konstantin for the feedback. >=20 >=20 > > Add resource reclamation APIs to make it simple for applications and > > libraries to integrate rte_rcu library. > > > > Signed-off-by: Honnappa Nagarahalli > > Reviewed-by: Ola Liljedhal > > Reviewed-by: Ruifeng Wang > > --- > > app/test/test_rcu_qsbr.c | 291 ++++++++++++++++++++++++++++- > > lib/librte_rcu/meson.build | 2 + > > lib/librte_rcu/rte_rcu_qsbr.c | 185 ++++++++++++++++++ > > lib/librte_rcu/rte_rcu_qsbr.h | 169 +++++++++++++++++ > > lib/librte_rcu/rte_rcu_qsbr_pvt.h | 46 +++++ > > lib/librte_rcu/rte_rcu_version.map | 4 + > > lib/meson.build | 6 +- > > 7 files changed, 700 insertions(+), 3 deletions(-) create mode > > 100644 lib/librte_rcu/rte_rcu_qsbr_pvt.h > > > > diff --git a/lib/librte_rcu/rte_rcu_qsbr.c > > b/lib/librte_rcu/rte_rcu_qsbr.c index ce7f93dd3..76814f50b 100644 > > --- a/lib/librte_rcu/rte_rcu_qsbr.c > > +++ b/lib/librte_rcu/rte_rcu_qsbr.c > > @@ -21,6 +21,7 @@ > > #include > > > > #include "rte_rcu_qsbr.h" > > +#include "rte_rcu_qsbr_pvt.h" > > > > /* Get the memory size of QSBR variable */ size_t @@ -267,6 +268,190 > > @@ rte_rcu_qsbr_dump(FILE *f, struct rte_rcu_qsbr *v) > > return 0; > > } > > > > +/* Create a queue used to store the data structure elements that can > > + * be freed later. This queue is referred to as 'defer queue'. > > + */ > > +struct rte_rcu_qsbr_dq * > > +rte_rcu_qsbr_dq_create(const struct rte_rcu_qsbr_dq_parameters > > +*params) { > > + struct rte_rcu_qsbr_dq *dq; > > + uint32_t qs_fifo_size; > > + > > + if (params =3D=3D NULL || params->f =3D=3D NULL || > > + params->v =3D=3D NULL || params->name =3D=3D NULL || > > + params->size =3D=3D 0 || params->esize =3D=3D 0 || > > + (params->esize % 8 !=3D 0)) { > > + rte_log(RTE_LOG_ERR, rte_rcu_log_type, > > + "%s(): Invalid input parameter\n", __func__); > > + rte_errno =3D EINVAL; > > + > > + return NULL; > > + } > > + > > + dq =3D rte_zmalloc(NULL, > > + (sizeof(struct rte_rcu_qsbr_dq) + params->esize), > > + RTE_CACHE_LINE_SIZE); > > + if (dq =3D=3D NULL) { > > + rte_errno =3D ENOMEM; > > + > > + return NULL; > > + } > > + > > + /* round up qs_fifo_size to next power of two that is not less than > > + * max_size. > > + */ > > + qs_fifo_size =3D rte_align32pow2((((params->esize/8) + 1) > > + * params->size) + 1); > > + dq->r =3D rte_ring_create(params->name, qs_fifo_size, > > + SOCKET_ID_ANY, 0); >=20 > If it is going to be not MT safe, then why not to create the ring with > (RING_F_SP_ENQ | RING_F_SC_DEQ) flags set? Agree. > Though I think it could be changed to allow MT safe multiple enqeue/singl= e > dequeue, see below. The MT safe issue is due to reclaim code. The reclaim code has the followin= g sequence: rte_ring_peek rte_rcu_qsbr_check rte_ring_dequeue This entire sequence needs to be atomic as the entry cannot be dequeued wit= hout knowing that the grace period for that entry is over. Note that due to= optimizations in rte_rcu_qsbr_check API, this sequence should not be large= in most cases. I do not have ideas on how to make this sequence lock-free. If the writer is on the control plane, most use cases will use mutex locks = for synchronization if they are multi-threaded. That lock should be enough = to provide the thread safety for these APIs. If the writer is multi-threaded and lock-free, then one should use per thre= ad defer queue. >=20 > > + if (dq->r =3D=3D NULL) { > > + rte_log(RTE_LOG_ERR, rte_rcu_log_type, > > + "%s(): defer queue create failed\n", __func__); > > + rte_free(dq); > > + return NULL; > > + } > > + > > + dq->v =3D params->v; > > + dq->size =3D params->size; > > + dq->esize =3D params->esize; > > + dq->f =3D params->f; > > + dq->p =3D params->p; > > + > > + return dq; > > +} > > + > > +/* Enqueue one resource to the defer queue to free after the grace > > + * period is over. > > + */ > > +int rte_rcu_qsbr_dq_enqueue(struct rte_rcu_qsbr_dq *dq, void *e) { > > + uint64_t token; > > + uint64_t *tmp; > > + uint32_t i; > > + uint32_t cur_size, free_size; > > + > > + if (dq =3D=3D NULL || e =3D=3D NULL) { > > + rte_log(RTE_LOG_ERR, rte_rcu_log_type, > > + "%s(): Invalid input parameter\n", __func__); > > + rte_errno =3D EINVAL; > > + > > + return 1; >=20 > Why just not to return -EINVAL straightway? > I think there is no much point to set rte_errno in that function at all, = just > return value should do. I am trying to keep these consistent with the existing APIs. They return 0 = or 1 and set the rte_errno. >=20 > > + } > > + > > + /* Start the grace period */ > > + token =3D rte_rcu_qsbr_start(dq->v); > > + > > + /* Reclaim resources if the queue is 1/8th full. This helps > > + * the queue from growing too large and allows time for reader > > + * threads to report their quiescent state. > > + */ > > + cur_size =3D rte_ring_count(dq->r) / (dq->esize/8 + 1); >=20 > Probably would be a bit easier if you just store in dq->esize (elt size += token > size) / 8. Agree >=20 > > + if (cur_size > (dq->size >> RTE_RCU_QSBR_AUTO_RECLAIM_LIMIT)) { >=20 > Why to make this threshold value hard-coded? > Why either not to put it into create parameter, or just return a special = return > value, to indicate that threshold is reached? My thinking was to keep the programming interface easy to use. The more the= parameters, the more painful it is for the user. IMO, the constants chosen= should be good enough for most cases. More advanced users could modify the= constants. However, we could make these as part of the parameters, but mak= e them optional for the user. For ex: if they set them to 0, default values= can be used. > Or even return number of filled/free entroes on success, so caller can de= cide > to reclaim or not based on that information on his own? This means more code on the user side. I think adding these to parameters s= eems like a better option. >=20 > > + rte_log(RTE_LOG_INFO, rte_rcu_log_type, > > + "%s(): Triggering reclamation\n", __func__); > > + rte_rcu_qsbr_dq_reclaim(dq); > > + } > > + > > + /* Check if there is space for atleast for 1 resource */ > > + free_size =3D rte_ring_free_count(dq->r) / (dq->esize/8 + 1); > > + if (!free_size) { > > + rte_log(RTE_LOG_ERR, rte_rcu_log_type, > > + "%s(): Defer queue is full\n", __func__); > > + rte_errno =3D ENOSPC; > > + return 1; > > + } > > + > > + /* Enqueue the resource */ > > + rte_ring_sp_enqueue(dq->r, (void *)(uintptr_t)token); > > + > > + /* The resource to enqueue needs to be a multiple of 64b > > + * due to the limitation of the rte_ring implementation. > > + */ > > + for (i =3D 0, tmp =3D (uint64_t *)e; i < dq->esize/8; i++, tmp++) > > + rte_ring_sp_enqueue(dq->r, (void *)(uintptr_t)*tmp); >=20 >=20 > That whole construction above looks a bit clumsy and error prone... > I suppose just: >=20 > const uint32_t nb_elt =3D dq->elt_size/8 + 1; uint32_t free, n; ... > n =3D rte_ring_enqueue_bulk(dq->r, e, nb_elt, &free); if (n =3D=3D 0) Yes, bulk enqueue can be used. But note that once the flexible element size= ring patch is done, this code will use that. > return -ENOSPC; > return free; >=20 > That way I think you can have MT-safe version of that function. Please see the description of MT safe issue above. >=20 > > + > > + return 0; > > +} > > + > > +/* Reclaim resources from the defer queue. */ int > > +rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq) { > > + uint32_t max_cnt; > > + uint32_t cnt; > > + void *token; > > + uint64_t *tmp; > > + uint32_t i; > > + > > + if (dq =3D=3D NULL) { > > + rte_log(RTE_LOG_ERR, rte_rcu_log_type, > > + "%s(): Invalid input parameter\n", __func__); > > + rte_errno =3D EINVAL; > > + > > + return 1; >=20 > Same story as above - I think rte_errno is excessive in this function. > Just return value should be enough. >=20 >=20 > > + } > > + > > + /* Anything to reclaim? */ > > + if (rte_ring_count(dq->r) =3D=3D 0) > > + return 0; >=20 > Not sure you need that, see below. >=20 > > + > > + /* Reclaim at the max 1/16th the total number of entries. */ > > + max_cnt =3D dq->size >> RTE_RCU_QSBR_MAX_RECLAIM_LIMIT; > > + max_cnt =3D (max_cnt =3D=3D 0) ? dq->size : max_cnt; >=20 > Again why not to make max_cnt a configurable at create() parameter? I think making this as an optional parameter for creating defer queue is a = better option. > Or even a parameter for that function? >=20 > > + cnt =3D 0; > > + > > + /* Check reader threads quiescent state and reclaim resources */ > > + while ((cnt < max_cnt) && (rte_ring_peek(dq->r, &token) =3D=3D 0) && > > + (rte_rcu_qsbr_check(dq->v, (uint64_t)((uintptr_t)token), false) > > + =3D=3D 1)) { >=20 >=20 > > + (void)rte_ring_sc_dequeue(dq->r, &token); > > + /* The resource to dequeue needs to be a multiple of 64b > > + * due to the limitation of the rte_ring implementation. > > + */ > > + for (i =3D 0, tmp =3D (uint64_t *)dq->e; i < dq->esize/8; > > + i++, tmp++) > > + (void)rte_ring_sc_dequeue(dq->r, > > + (void *)(uintptr_t)tmp); >=20 > Again, no need for such constructs with multiple dequeuer I believe. > Just: >=20 > const uint32_t nb_elt =3D dq->elt_size/8 + 1; uint32_t n; uintptr_t > elt[nb_elt]; ... > n =3D rte_ring_dequeue_bulk(dq->r, elt, nb_elt, NULL); if (n !=3D 0) {dq-= >f(dq->p, > elt);} Agree on bulk API use. >=20 > Seems enough. > Again in that case you can have enqueue/reclaim running in different thre= ads > simultaneously, plus you don't need dq->e at all. Will check on dq->e >=20 > > + dq->f(dq->p, dq->e); > > + > > + cnt++; > > + } > > + > > + rte_log(RTE_LOG_INFO, rte_rcu_log_type, > > + "%s(): Reclaimed %u resources\n", __func__, cnt); > > + > > + if (cnt =3D=3D 0) { > > + /* No resources were reclaimed */ > > + rte_errno =3D EAGAIN; > > + return 1; > > + } > > + > > + return 0; >=20 > I'd suggest to return cnt on success. I am trying to keep the APIs simple. I do not see much use for 'cnt' as ret= urn value to the user. It exposes more details which I think are internal t= o the library. >=20 > > +} > > + > > +/* Delete a defer queue. */ > > +int > > +rte_rcu_qsbr_dq_delete(struct rte_rcu_qsbr_dq *dq) { > > + if (dq =3D=3D NULL) { > > + rte_log(RTE_LOG_ERR, rte_rcu_log_type, > > + "%s(): Invalid input parameter\n", __func__); > > + rte_errno =3D EINVAL; > > + > > + return 1; > > + } > > + > > + /* Reclaim all the resources */ > > + if (rte_rcu_qsbr_dq_reclaim(dq) !=3D 0) > > + /* Error number is already set by the reclaim API */ > > + return 1; >=20 > How do you know that you have reclaimed everything? Good point, will come back with a different solution. >=20 > > + > > + rte_ring_free(dq->r); > > + rte_free(dq); > > + > > + return 0; > > +} > > + > > int rte_rcu_log_type; > > > > RTE_INIT(rte_rcu_register) > > diff --git a/lib/librte_rcu/rte_rcu_qsbr.h > > b/lib/librte_rcu/rte_rcu_qsbr.h index c80f15c00..185d4b50a 100644 > > --- a/lib/librte_rcu/rte_rcu_qsbr.h > > +++ b/lib/librte_rcu/rte_rcu_qsbr.h > > @@ -34,6 +34,7 @@ extern "C" { > > #include > > #include > > #include > > +#include > > > > extern int rte_rcu_log_type; > > > > @@ -109,6 +110,67 @@ struct rte_rcu_qsbr { > > */ > > } __rte_cache_aligned; > > > > +/** > > + * Call back function called to free the resources. > > + * > > + * @param p > > + * Pointer provided while creating the defer queue > > + * @param e > > + * Pointer to the resource data stored on the defer queue > > + * > > + * @return > > + * None > > + */ > > +typedef void (*rte_rcu_qsbr_free_resource)(void *p, void *e); >=20 > Stylish thing - usually in DPDK we have typedf newtype_t ... > Though I am not sure you need a new typedef at all - just a function poin= ter > inside the struct seems enough. Other libraries (for ex: rte_hash) use this approach. I think it is better = to keep it out of the structure to allow for better commenting. >=20 > > + > > +#define RTE_RCU_QSBR_DQ_NAMESIZE RTE_RING_NAMESIZE > > + > > +/** > > + * Trigger automatic reclamation after 1/8th the defer queue is full. > > + */ > > +#define RTE_RCU_QSBR_AUTO_RECLAIM_LIMIT 3 > > + > > +/** > > + * Reclaim at the max 1/16th the total number of resources. > > + */ > > +#define RTE_RCU_QSBR_MAX_RECLAIM_LIMIT 4 >=20 >=20 > As I said above, I don't think these thresholds need to be hardcoded. > In any case, there seems not much point to put them in the public header = file. >=20 > > + > > +/** > > + * Parameters used when creating the defer queue. > > + */ > > +struct rte_rcu_qsbr_dq_parameters { > > + const char *name; > > + /**< Name of the queue. */ > > + uint32_t size; > > + /**< Number of entries in queue. Typically, this will be > > + * the same as the maximum number of entries supported in the > > + * lock free data structure. > > + * Data structures with unbounded number of entries is not > > + * supported currently. > > + */ > > + uint32_t esize; > > + /**< Size (in bytes) of each element in the defer queue. > > + * This has to be multiple of 8B as the rte_ring APIs > > + * support 8B element sizes only. > > + */ > > + rte_rcu_qsbr_free_resource f; > > + /**< Function to call to free the resource. */ > > + void *p; >=20 > Style nit again - I like short names myself, but that seems a bit extreme= ... :) > Might be at least: > void (*reclaim)(void *, void *); May be 'free_fn'? > void * reclaim_data; > ? This is the pointer to the data structure to free the resource into. For ex= : In LPM data structure, it will be pointer to LPM. 'reclaim_data' does not= convey the meaning correctly. >=20 > > + /**< Pointer passed to the free function. Typically, this is the > > + * pointer to the data structure to which the resource to free > > + * belongs. This can be NULL. > > + */ > > + struct rte_rcu_qsbr *v; >=20 > Does it need to be inside that struct? > Might be better: > rte_rcu_qsbr_dq_create(struct rte_rcu_qsbr *v, const struct > rte_rcu_qsbr_dq_parameters *params); The API takes a parameter structure as input anyway, why to add another arg= ument to the function? QSBR variable is also another parameter. >=20 > Another alternative: make both reclaim() and enqueue() to take v as a > parameter. But both of them need access to some of the parameters provided in rte_rcu_= qsbr_dq_create API. We would end up passing 2 arguments to the functions. >=20 > > + /**< RCU QSBR variable to use for this defer queue */ }; > > + > > +/* RTE defer queue structure. > > + * This structure holds the defer queue. The defer queue is used to > > + * hold the deleted entries from the data structure that are not > > + * yet freed. > > + */ > > +struct rte_rcu_qsbr_dq; > > + > > /** > > * @warning > > * @b EXPERIMENTAL: this API may change without prior notice @@ > > -648,6 +710,113 @@ __rte_experimental int rte_rcu_qsbr_dump(FILE *f, > > struct rte_rcu_qsbr *v); > > > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice > > + * > > + * Create a queue used to store the data structure elements that can > > + * be freed later. This queue is referred to as 'defer queue'. > > + * > > + * @param params > > + * Parameters to create a defer queue. > > + * @return > > + * On success - Valid pointer to defer queue > > + * On error - NULL > > + * Possible rte_errno codes are: > > + * - EINVAL - NULL parameters are passed > > + * - ENOMEM - Not enough memory > > + */ > > +__rte_experimental > > +struct rte_rcu_qsbr_dq * > > +rte_rcu_qsbr_dq_create(const struct rte_rcu_qsbr_dq_parameters > > +*params); > > + > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice > > + * > > + * Enqueue one resource to the defer queue and start the grace period. > > + * The resource will be freed later after at least one grace period > > + * is over. > > + * > > + * If the defer queue is full, it will attempt to reclaim resources. > > + * It will also reclaim resources at regular intervals to avoid > > + * the defer queue from growing too big. > > + * > > + * This API is not multi-thread safe. It is expected that the caller > > + * provides multi-thread safety by locking a mutex or some other means= . > > + * > > + * A lock free multi-thread writer algorithm could achieve > > +multi-thread > > + * safety by creating and using one defer queue per thread. > > + * > > + * @param dq > > + * Defer queue to allocate an entry from. > > + * @param e > > + * Pointer to resource data to copy to the defer queue. The size of > > + * the data to copy is equal to the element size provided when the > > + * defer queue was created. > > + * @return > > + * On success - 0 > > + * On error - 1 with rte_errno set to > > + * - EINVAL - NULL parameters are passed > > + * - ENOSPC - Defer queue is full. This condition can not happen > > + * if the defer queue size is equal (or larger) than the > > + * number of elements in the data structure. > > + */ > > +__rte_experimental > > +int > > +rte_rcu_qsbr_dq_enqueue(struct rte_rcu_qsbr_dq *dq, void *e); > > + > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice > > + * > > + * Reclaim resources from the defer queue. > > + * > > + * This API is not multi-thread safe. It is expected that the caller > > + * provides multi-thread safety by locking a mutex or some other means= . > > + * > > + * A lock free multi-thread writer algorithm could achieve > > +multi-thread > > + * safety by creating and using one defer queue per thread. > > + * > > + * @param dq > > + * Defer queue to reclaim an entry from. > > + * @return > > + * On successful reclamation of at least 1 resource - 0 > > + * On error - 1 with rte_errno set to > > + * - EINVAL - NULL parameters are passed > > + * - EAGAIN - None of the resources have completed at least 1 grace > period, > > + * try again. > > + */ > > +__rte_experimental > > +int > > +rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq); > > + > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice > > + * > > + * Delete a defer queue. > > + * > > + * It tries to reclaim all the resources on the defer queue. > > + * If any of the resources have not completed the grace period > > + * the reclamation stops and returns immediately. The rest of > > + * the resources are not reclaimed and the defer queue is not > > + * freed. > > + * > > + * @param dq > > + * Defer queue to delete. > > + * @return > > + * On success - 0 > > + * On error - 1 > > + * Possible rte_errno codes are: > > + * - EINVAL - NULL parameters are passed > > + * - EAGAIN - Some of the resources have not completed at least 1 gr= ace > > + * period, try again. > > + */ > > +__rte_experimental > > +int > > +rte_rcu_qsbr_dq_delete(struct rte_rcu_qsbr_dq *dq); > > + > > #ifdef __cplusplus > > } > > #endif > > diff --git a/lib/librte_rcu/rte_rcu_qsbr_pvt.h > > b/lib/librte_rcu/rte_rcu_qsbr_pvt.h > > new file mode 100644 > > index 000000000..2122bc36a > > --- /dev/null > > +++ b/lib/librte_rcu/rte_rcu_qsbr_pvt.h >=20 > Again style suggestion: as it is not public header - don't use rte_ prefi= x for > naming. > From my perspective - easier to relalize for reader what is public header= , > what is not. Looks like the guidelines are not defined very well. I see one private file= with rte_ prefix. I see Stephen not using rte_ prefix. I do not have any p= reference. But, a consistent approach is required. >=20 > > @@ -0,0 +1,46 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * Copyright (c) 2019 Arm Limited > > + */ > > + > > +#ifndef _RTE_RCU_QSBR_PVT_H_ > > +#define _RTE_RCU_QSBR_PVT_H_ > > + > > +/** > > + * This file is private to the RCU library. It should not be included > > + * by the user of this library. > > + */ > > + > > +#ifdef __cplusplus > > +extern "C" { > > +#endif > > + > > +#include "rte_rcu_qsbr.h" > > + > > +/* RTE defer queue structure. > > + * This structure holds the defer queue. The defer queue is used to > > + * hold the deleted entries from the data structure that are not > > + * yet freed. > > + */ > > +struct rte_rcu_qsbr_dq { > > + struct rte_rcu_qsbr *v; /**< RCU QSBR variable used by this queue.*/ > > + struct rte_ring *r; /**< RCU QSBR defer queue. */ > > + uint32_t size; > > + /**< Number of elements in the defer queue */ > > + uint32_t esize; > > + /**< Size (in bytes) of data stored on the defer queue */ > > + rte_rcu_qsbr_free_resource f; > > + /**< Function to call to free the resource. */ > > + void *p; > > + /**< Pointer passed to the free function. Typically, this is the > > + * pointer to the data structure to which the resource to free > > + * belongs. > > + */ > > + char e[0]; > > + /**< Temporary storage to copy the defer queue element. */ >=20 > Do you really need 'e' at all? > Can't it be just temporary stack variable? Ok, will check. >=20 > > +}; > > + > > +#ifdef __cplusplus > > +} > > +#endif > > + > > +#endif /* _RTE_RCU_QSBR_PVT_H_ */ > > diff --git a/lib/librte_rcu/rte_rcu_version.map > > b/lib/librte_rcu/rte_rcu_version.map > > index f8b9ef2ab..dfac88a37 100644 > > --- a/lib/librte_rcu/rte_rcu_version.map > > +++ b/lib/librte_rcu/rte_rcu_version.map > > @@ -8,6 +8,10 @@ EXPERIMENTAL { > > rte_rcu_qsbr_synchronize; > > rte_rcu_qsbr_thread_register; > > rte_rcu_qsbr_thread_unregister; > > + rte_rcu_qsbr_dq_create; > > + rte_rcu_qsbr_dq_enqueue; > > + rte_rcu_qsbr_dq_reclaim; > > + rte_rcu_qsbr_dq_delete; > > > > local: *; > > }; > > diff --git a/lib/meson.build b/lib/meson.build index > > e5ff83893..0e1be8407 100644 > > --- a/lib/meson.build > > +++ b/lib/meson.build > > @@ -11,7 +11,9 @@ > > libraries =3D [ > > 'kvargs', # eal depends on kvargs > > 'eal', # everything depends on eal > > - 'ring', 'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core > > + 'ring', > > + 'rcu', # rcu depends on ring > > + 'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core > > 'cmdline', > > 'metrics', # bitrate/latency stats depends on this > > 'hash', # efd depends on this > > @@ -22,7 +24,7 @@ libraries =3D [ > > 'gro', 'gso', 'ip_frag', 'jobstats', > > 'kni', 'latencystats', 'lpm', 'member', > > 'power', 'pdump', 'rawdev', > > - 'rcu', 'reorder', 'sched', 'security', 'stack', 'vhost', > > + 'reorder', 'sched', 'security', 'stack', 'vhost', > > # ipsec lib depends on net, crypto and security > > 'ipsec', > > # add pkt framework libs which use other libs from above > > -- > > 2.17.1