From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id AC5A9A2F18
	for <public@inbox.dpdk.org>; Thu,  3 Oct 2019 08:29:45 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id C88591BFCD;
	Thu,  3 Oct 2019 08:29:44 +0200 (CEST)
Received: from EUR02-AM5-obe.outbound.protection.outlook.com
 (mail-eopbgr00055.outbound.protection.outlook.com [40.107.0.55])
 by dpdk.org (Postfix) with ESMTP id 0630B1BFCC
 for <dev@dpdk.org>; Thu,  3 Oct 2019 08:29:43 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; 
 s=selector2-armh-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=gWw3KS58zS1gNO6fzD0TU2t2R9KZQuup8bVveucDqq0=;
 b=kYMqCzWcOM+r30FXsFqNdOC4r8iv4l6hQKQ3TCIEPzRfgyAKIk8aLsnSHlEF6gkrmkkGbQ2Fmg7GzJFCoigdCiCplIwzqOkboqSedk7fnVhML7S0sG+vtbXwNm4QCWc6oU5gDBfolSX5LCY1/EmgEk2UreLau2QHNFj7nZ+fEWo=
Received: from DB6PR0802CA0028.eurprd08.prod.outlook.com (2603:10a6:4:a3::14)
 by DB8PR08MB4137.eurprd08.prod.outlook.com (2603:10a6:10:a5::16) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2305.20; Thu, 3 Oct
 2019 06:29:41 +0000
Received: from DB5EUR03FT009.eop-EUR03.prod.protection.outlook.com
 (2a01:111:f400:7e0a::206) by DB6PR0802CA0028.outlook.office365.com
 (2603:10a6:4:a3::14) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.2305.20 via Frontend
 Transport; Thu, 3 Oct 2019 06:29:41 +0000
Authentication-Results: spf=temperror (sender IP is 63.35.35.123)
 smtp.mailfrom=arm.com; dpdk.org; dkim=pass (signature was verified)
 header.d=armh.onmicrosoft.com;dpdk.org; dmarc=none action=none
 header.from=arm.com;
Received-SPF: TempError (protection.outlook.com: error in processing during
 lookup of arm.com: DNS Timeout)
Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by
 DB5EUR03FT009.mail.protection.outlook.com (10.152.20.117) with
 Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id
 15.20.2305.15 via Frontend Transport; Thu, 3 Oct 2019 06:29:40 +0000
Received: ("Tessian outbound 927f2cdd66cc:v33");
 Thu, 03 Oct 2019 06:29:36 +0000
X-CR-MTA-TID: 64aa7808
Received: from 6da6d61cc5a2.2 (ip-172-16-0-2.eu-west-1.compute.internal
 [104.47.14.58]) by 64aa7808-outbound-1.mta.getcheckrecipient.com id
 93D7543D-E1DE-4FF1-A69A-5A3D147076BD.1; 
 Thu, 03 Oct 2019 06:29:31 +0000
Received: from EUR04-VI1-obe.outbound.protection.outlook.com
 (mail-vi1eur04lp2058.outbound.protection.outlook.com [104.47.14.58])
 by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 6da6d61cc5a2.2
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);
 Thu, 03 Oct 2019 06:29:31 +0000
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=WK+v/PqJo8xhacLf8FuSPNrjgsxmfrCGcSuoNqy6ZzBbPmtWXSr2Bg2dfp21FyMu0q0ngZShxOIL9tkhTF8gu8QRapUUmzVyj/jKdQqLNwHVwswybzVwgoEBCg/oKPkhqptthYHzC8gKUBZZjq/29Y5BGKtuPm/ULGJvqWOviuhC0cmFmGqW3mU/TgpRVUAF4Z0vRWYLwr0EnetYsD0jjIQG32d1WE/8Dn/uDlq0V+XLPUmXwayftm7h4qsmqYlCclDw4Azi7fUaKuS/3TdR/JGAZ7yJSlsaNY5To+3xUtLElFrauxGEHJ/KmZOlRWs4jVbwI+0olte08WwGWaVq7Q==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=gWw3KS58zS1gNO6fzD0TU2t2R9KZQuup8bVveucDqq0=;
 b=cDsLSlAU+HZc6Wr+jwOnvuPj1iIzCfNeiyeOBtszughjtQdlJ0sFBPWOCdf1QBjRQ+PkloMo1lPBjcxvgZ3S75gsLQgSM5IxKmTbohJqbPhbWbkeTe5k6KIOmNJQNn0WF2RxHA7QxGr1uWETHRWWXazI0KoWhWnkUNIasiwcSi3zQWb+1B6Fbcg9+qQZRw849viaOdIoiz1KaAz9TEBqVFLEGKFRQMWgTydAxA+g0OUkVa/gLgmiyRMXQpFi3uHcT6KQKs2RkJrCYvOkt2xEMrp6vY3xI+scIvaV9SHDO63Dlk3zAcUwEOoOQOiz+vnLVUBfqVOdAqRbYoN/3SH4hg==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass
 header.d=arm.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; 
 s=selector2-armh-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=gWw3KS58zS1gNO6fzD0TU2t2R9KZQuup8bVveucDqq0=;
 b=kYMqCzWcOM+r30FXsFqNdOC4r8iv4l6hQKQ3TCIEPzRfgyAKIk8aLsnSHlEF6gkrmkkGbQ2Fmg7GzJFCoigdCiCplIwzqOkboqSedk7fnVhML7S0sG+vtbXwNm4QCWc6oU5gDBfolSX5LCY1/EmgEk2UreLau2QHNFj7nZ+fEWo=
Received: from VE1PR08MB5149.eurprd08.prod.outlook.com (20.179.30.27) by
 VE1PR08MB5279.eurprd08.prod.outlook.com (20.179.29.146) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.2305.15; Thu, 3 Oct 2019 06:29:29 +0000
Received: from VE1PR08MB5149.eurprd08.prod.outlook.com
 ([fe80::8c82:8d9c:c78d:22a6]) by VE1PR08MB5149.eurprd08.prod.outlook.com
 ([fe80::8c82:8d9c:c78d:22a6%7]) with mapi id 15.20.2305.023; Thu, 3 Oct 2019
 06:29:29 +0000
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 "stephen@networkplumber.org" <stephen@networkplumber.org>,
 "paulmck@linux.ibm.com" <paulmck@linux.ibm.com>
CC: "Wang, Yipeng1" <yipeng1.wang@intel.com>, "Medvedkin, Vladimir"
 <vladimir.medvedkin@intel.com>, "Ruifeng Wang (Arm Technology China)"
 <Ruifeng.Wang@arm.com>, Dharmik Thakkar <Dharmik.Thakkar@arm.com>, Honnappa
 Nagarahalli <Honnappa.Nagarahalli@arm.com>, "dev@dpdk.org" <dev@dpdk.org>, nd
 <nd@arm.com>, nd <nd@arm.com>
Thread-Topic: [PATCH v3 2/3] lib/rcu: add resource reclamation APIs
Thread-Index: AQHVeUhlHoJ4LCRP6EyrG0pYBVGBKKdIZEYQ
Date: Thu, 3 Oct 2019 06:29:28 +0000
Message-ID: <VE1PR08MB514912CDFC19476AF82B197F989F0@VE1PR08MB5149.eurprd08.prod.outlook.com>
References: <20190906094534.36060-1-ruifeng.wang@arm.com>
 <20191001062917.35578-1-honnappa.nagarahalli@arm.com>
 <20191001062917.35578-3-honnappa.nagarahalli@arm.com>
 <2601191342CEEE43887BDE71AB977258019196FF8E@irsmsx105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB977258019196FF8E@irsmsx105.ger.corp.intel.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ts-tracking-id: 09b724cd-79b5-41fe-8e7f-8b333204cab2.0
x-checkrecipientchecked: true
Authentication-Results-Original: spf=none (sender IP is )
 smtp.mailfrom=Honnappa.Nagarahalli@arm.com; 
x-originating-ip: [217.140.111.135]
x-ms-publictraffictype: Email
X-MS-Office365-Filtering-Correlation-Id: 0473f43b-0992-468a-e359-08d747cb126c
X-MS-Office365-Filtering-HT: Tenant
X-MS-TrafficTypeDiagnostic: VE1PR08MB5279:|VE1PR08MB5279:|DB8PR08MB4137:
x-ms-exchange-transport-forked: True
X-Microsoft-Antispam-PRVS: <DB8PR08MB413716F9B43B6ABBE261C4A5989F0@DB8PR08MB4137.eurprd08.prod.outlook.com>
x-checkrecipientrouted: true
x-ms-oob-tlc-oobclassifiers: OLM:7219;OLM:7219;
x-forefront-prvs: 01792087B6
X-Forefront-Antispam-Report-Untrusted: SFV:NSPM;
 SFS:(10009020)(4636009)(346002)(396003)(39860400002)(366004)(136003)(376002)(52314003)(189003)(199004)(86362001)(30864003)(256004)(2501003)(6246003)(71200400001)(71190400001)(2201001)(5660300002)(55016002)(14444005)(4326008)(6436002)(9686003)(52536014)(229853002)(66446008)(8676002)(64756008)(81156014)(476003)(186003)(81166006)(25786009)(76116006)(76176011)(66946007)(316002)(6506007)(478600001)(66556008)(110136005)(11346002)(446003)(54906003)(6116002)(3846002)(33656002)(8936002)(486006)(66066001)(99286004)(7736002)(74316002)(7696005)(14454004)(305945005)(26005)(102836004)(2906002)(66476007)(579004);
 DIR:OUT; SFP:1101; SCL:1; SRVR:VE1PR08MB5279;
 H:VE1PR08MB5149.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en;
 PTR:InfoNoRecords; MX:1; A:1; 
received-spf: None (protection.outlook.com: arm.com does not designate
 permitted sender hosts)
X-MS-Exchange-SenderADCheck: 1
X-Microsoft-Antispam-Untrusted: BCL:0;
X-Microsoft-Antispam-Message-Info-Original: b44Q1njXZSoT2o+GMNeXLMLVVEzvguuxX5OLbajcMBA4+Iy3lrElns9zEOLF6M3DS/22UEn/Wupk3vGPBoxXQd081yZ9J7+q2mHIoetskajkrZ48sson23Acs6ebZthC66Dpt9nJ93IWIxSrJ6oLsRcg1XbUVvhaJlggqLVcKgyKM1aLSpIB9wNAPIhqjekMTtbjElM57ixU0mOhdM2yR650a9X+avaAqJjFC2ZCiQntcHBaarU/+vz7qklO9z9o9hN5TJTHUoF2Xy04tA7YZx2VT3YNBEsuzRuR4AEoHQSguRJKbXCJ5I3x7IyM95FgpAkV6FlJuId3GtkYqkiEdpdL2SpUCvj8JqArPKqHWHo3PDMuON4ggE8V6YZ9S5irTNYIfTTJ4f64vviR++kRDZ1KGaRJxQ7e+862v2LcsOc=
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5279
Original-Authentication-Results: spf=none (sender IP is )
 smtp.mailfrom=Honnappa.Nagarahalli@arm.com; 
X-EOPAttributedMessage: 0
X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT009.eop-EUR03.prod.protection.outlook.com
X-Forefront-Antispam-Report: CIP:63.35.35.123; IPV:CAL; SCL:-1; CTRY:IE;
 EFV:NLI; SFV:NSPM;
 SFS:(10009020)(4636009)(376002)(396003)(346002)(39860400002)(136003)(189003)(52314003)(199004)(3846002)(76130400001)(6116002)(11346002)(99286004)(8676002)(229853002)(81166006)(81156014)(7736002)(25786009)(305945005)(74316002)(4326008)(8746002)(8936002)(9686003)(55016002)(6246003)(86362001)(23726003)(14444005)(33656002)(50466002)(46406003)(2906002)(2501003)(97756001)(478600001)(26826003)(14454004)(316002)(446003)(76176011)(22756006)(63350400001)(7696005)(102836004)(356004)(126002)(486006)(6506007)(70206006)(476003)(110136005)(70586007)(52536014)(54906003)(5660300002)(186003)(66066001)(47776003)(30864003)(26005)(2201001)(336012);
 DIR:OUT; SFP:1101; SCL:1; SRVR:DB8PR08MB4137;
 H:64aa7808-outbound-1.mta.getcheckrecipient.com; FPR:; SPF:TempError; LANG:en;
 PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; MX:1; A:1; 
X-MS-Office365-Filtering-Correlation-Id-Prvs: 8051a523-4366-460f-a0e9-08d747cb0b5e
NoDisclaimer: True
X-Forefront-PRVS: 01792087B6
X-Microsoft-Antispam: BCL:0;
X-Microsoft-Antispam-Message-Info: FTKxZ+TcivnDJ1WLeKViK96Sdss5G+Vk4PBfgg2joN7MSNBU2kDfyM7Uj0F+Na5lyxB70ZjECv8tgC5lBHeEFKus6y+WRGzxiPGA/G27dB8VNanaKxG6p53E/eugTkZYn0E0Dmnn5QH9Kg3gnOLc41rHC++wEbSzB6pfvOXuarUwOp9eGREIZ9DyJFLGg/SQgxrEKXjzvH1GHe/F394QY4ZiaCFjCoGvW3zlN1J4CSNV/pX+RIMNqTkukDrsbKGZiKo1RMzbemAYUGkyPlRiYGlVxquz0Xu8pK/3w+u29/x67WsZyuaFoesvsAcAOWwTUb3+6X6N0QR16Ly6uhWXbmlcVYbG1dHyQGDi4wrkVuxVVCUENGCtD1NlrcyIM/3CAvjJx7VW2KNJUYmcih3+Wu2HfNolPCQqHCgZ3OYUz9E=
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Oct 2019 06:29:40.8553 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 0473f43b-0992-468a-e359-08d747cb126c
X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];
 Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR08MB4137
Subject: Re: [dpdk-dev] [PATCH v3 2/3] lib/rcu: add resource reclamation APIs
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

>=20
> Hi Honnappa,
Thanks Konstantin for the feedback.

>=20
>=20
> > Add resource reclamation APIs to make it simple for applications and
> > libraries to integrate rte_rcu library.
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Ola Liljedhal <ola.liljedhal@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  app/test/test_rcu_qsbr.c           | 291 ++++++++++++++++++++++++++++-
> >  lib/librte_rcu/meson.build         |   2 +
> >  lib/librte_rcu/rte_rcu_qsbr.c      | 185 ++++++++++++++++++
> >  lib/librte_rcu/rte_rcu_qsbr.h      | 169 +++++++++++++++++
> >  lib/librte_rcu/rte_rcu_qsbr_pvt.h  |  46 +++++
> >  lib/librte_rcu/rte_rcu_version.map |   4 +
> >  lib/meson.build                    |   6 +-
> >  7 files changed, 700 insertions(+), 3 deletions(-)  create mode
> > 100644 lib/librte_rcu/rte_rcu_qsbr_pvt.h
> >
> > diff --git a/lib/librte_rcu/rte_rcu_qsbr.c
> > b/lib/librte_rcu/rte_rcu_qsbr.c index ce7f93dd3..76814f50b 100644
> > --- a/lib/librte_rcu/rte_rcu_qsbr.c
> > +++ b/lib/librte_rcu/rte_rcu_qsbr.c
> > @@ -21,6 +21,7 @@
> >  #include <rte_errno.h>
> >
> >  #include "rte_rcu_qsbr.h"
> > +#include "rte_rcu_qsbr_pvt.h"
> >
> >  /* Get the memory size of QSBR variable */  size_t @@ -267,6 +268,190
> > @@ rte_rcu_qsbr_dump(FILE *f, struct rte_rcu_qsbr *v)
> >  	return 0;
> >  }
> >
> > +/* Create a queue used to store the data structure elements that can
> > + * be freed later. This queue is referred to as 'defer queue'.
> > + */
> > +struct rte_rcu_qsbr_dq *
> > +rte_rcu_qsbr_dq_create(const struct rte_rcu_qsbr_dq_parameters
> > +*params) {
> > +	struct rte_rcu_qsbr_dq *dq;
> > +	uint32_t qs_fifo_size;
> > +
> > +	if (params =3D=3D NULL || params->f =3D=3D NULL ||
> > +		params->v =3D=3D NULL || params->name =3D=3D NULL ||
> > +		params->size =3D=3D 0 || params->esize =3D=3D 0 ||
> > +		(params->esize % 8 !=3D 0)) {
> > +		rte_log(RTE_LOG_ERR, rte_rcu_log_type,
> > +			"%s(): Invalid input parameter\n", __func__);
> > +		rte_errno =3D EINVAL;
> > +
> > +		return NULL;
> > +	}
> > +
> > +	dq =3D rte_zmalloc(NULL,
> > +		(sizeof(struct rte_rcu_qsbr_dq) + params->esize),
> > +		RTE_CACHE_LINE_SIZE);
> > +	if (dq =3D=3D NULL) {
> > +		rte_errno =3D ENOMEM;
> > +
> > +		return NULL;
> > +	}
> > +
> > +	/* round up qs_fifo_size to next power of two that is not less than
> > +	 * max_size.
> > +	 */
> > +	qs_fifo_size =3D rte_align32pow2((((params->esize/8) + 1)
> > +					* params->size) + 1);
> > +	dq->r =3D rte_ring_create(params->name, qs_fifo_size,
> > +					SOCKET_ID_ANY, 0);
>=20
> If it is going to be not MT safe, then why not to create the ring with
> (RING_F_SP_ENQ | RING_F_SC_DEQ) flags set?
Agree.

> Though I think it could be changed to allow MT safe multiple enqeue/singl=
e
> dequeue, see below.
The MT safe issue is due to reclaim code. The reclaim code has the followin=
g sequence:

rte_ring_peek
rte_rcu_qsbr_check
rte_ring_dequeue

This entire sequence needs to be atomic as the entry cannot be dequeued wit=
hout knowing that the grace period for that entry is over. Note that due to=
 optimizations in rte_rcu_qsbr_check API, this sequence should not be large=
 in most cases. I do not have ideas on how to make this sequence lock-free.

If the writer is on the control plane, most use cases will use mutex locks =
for synchronization if they are multi-threaded. That lock should be enough =
to provide the thread safety for these APIs.

If the writer is multi-threaded and lock-free, then one should use per thre=
ad defer queue.

>=20
> > +	if (dq->r =3D=3D NULL) {
> > +		rte_log(RTE_LOG_ERR, rte_rcu_log_type,
> > +			"%s(): defer queue create failed\n", __func__);
> > +		rte_free(dq);
> > +		return NULL;
> > +	}
> > +
> > +	dq->v =3D params->v;
> > +	dq->size =3D params->size;
> > +	dq->esize =3D params->esize;
> > +	dq->f =3D params->f;
> > +	dq->p =3D params->p;
> > +
> > +	return dq;
> > +}
> > +
> > +/* Enqueue one resource to the defer queue to free after the grace
> > + * period is over.
> > + */
> > +int rte_rcu_qsbr_dq_enqueue(struct rte_rcu_qsbr_dq *dq, void *e) {
> > +	uint64_t token;
> > +	uint64_t *tmp;
> > +	uint32_t i;
> > +	uint32_t cur_size, free_size;
> > +
> > +	if (dq =3D=3D NULL || e =3D=3D NULL) {
> > +		rte_log(RTE_LOG_ERR, rte_rcu_log_type,
> > +			"%s(): Invalid input parameter\n", __func__);
> > +		rte_errno =3D EINVAL;
> > +
> > +		return 1;
>=20
> Why just not to return -EINVAL straightway?
> I think there is no much point to set rte_errno in that function at all, =
just
> return value should do.
I am trying to keep these consistent with the existing APIs. They return 0 =
or 1 and set the rte_errno.

>=20
> > +	}
> > +
> > +	/* Start the grace period */
> > +	token =3D rte_rcu_qsbr_start(dq->v);
> > +
> > +	/* Reclaim resources if the queue is 1/8th full. This helps
> > +	 * the queue from growing too large and allows time for reader
> > +	 * threads to report their quiescent state.
> > +	 */
> > +	cur_size =3D rte_ring_count(dq->r) / (dq->esize/8 + 1);
>=20
> Probably would be a bit easier if you just store in dq->esize (elt size +=
 token
> size) / 8.
Agree

>=20
> > +	if (cur_size > (dq->size >> RTE_RCU_QSBR_AUTO_RECLAIM_LIMIT)) {
>=20
> Why to make this threshold value hard-coded?
> Why either not to put it into create parameter, or just return a special =
return
> value, to indicate that threshold is reached?
My thinking was to keep the programming interface easy to use. The more the=
 parameters, the more painful it is for the user. IMO, the constants chosen=
 should be good enough for most cases. More advanced users could modify the=
 constants. However, we could make these as part of the parameters, but mak=
e them optional for the user. For ex: if they set them to 0, default values=
 can be used.

> Or even return number of filled/free entroes on success, so caller can de=
cide
> to reclaim or not based on that information on his own?
This means more code on the user side. I think adding these to parameters s=
eems like a better option.

>=20
> > +		rte_log(RTE_LOG_INFO, rte_rcu_log_type,
> > +			"%s(): Triggering reclamation\n", __func__);
> > +		rte_rcu_qsbr_dq_reclaim(dq);
> > +	}
> > +
> > +	/* Check if there is space for atleast for 1 resource */
> > +	free_size =3D rte_ring_free_count(dq->r) / (dq->esize/8 + 1);
> > +	if (!free_size) {
> > +		rte_log(RTE_LOG_ERR, rte_rcu_log_type,
> > +			"%s(): Defer queue is full\n", __func__);
> > +		rte_errno =3D ENOSPC;
> > +		return 1;
> > +	}
> > +
> > +	/* Enqueue the resource */
> > +	rte_ring_sp_enqueue(dq->r, (void *)(uintptr_t)token);
> > +
> > +	/* The resource to enqueue needs to be a multiple of 64b
> > +	 * due to the limitation of the rte_ring implementation.
> > +	 */
> > +	for (i =3D 0, tmp =3D (uint64_t *)e; i < dq->esize/8; i++, tmp++)
> > +		rte_ring_sp_enqueue(dq->r, (void *)(uintptr_t)*tmp);
>=20
>=20
> That whole construction above looks a bit clumsy and error prone...
> I suppose just:
>=20
> const uint32_t nb_elt =3D  dq->elt_size/8 + 1; uint32_t free, n; ...
> n =3D rte_ring_enqueue_bulk(dq->r, e, nb_elt, &free); if (n =3D=3D 0)
Yes, bulk enqueue can be used. But note that once the flexible element size=
 ring patch is done, this code will use that.

>   return -ENOSPC;
> return free;
>=20
> That way I think you can have MT-safe version of that function.
Please see the description of MT safe issue above.

>=20
> > +
> > +	return 0;
> > +}
> > +
> > +/* Reclaim resources from the defer queue. */ int
> > +rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq) {
> > +	uint32_t max_cnt;
> > +	uint32_t cnt;
> > +	void *token;
> > +	uint64_t *tmp;
> > +	uint32_t i;
> > +
> > +	if (dq =3D=3D NULL) {
> > +		rte_log(RTE_LOG_ERR, rte_rcu_log_type,
> > +			"%s(): Invalid input parameter\n", __func__);
> > +		rte_errno =3D EINVAL;
> > +
> > +		return 1;
>=20
> Same story as above - I think rte_errno is excessive in this function.
> Just return value should be enough.
>=20
>=20
> > +	}
> > +
> > +	/* Anything to reclaim? */
> > +	if (rte_ring_count(dq->r) =3D=3D 0)
> > +		return 0;
>=20
> Not sure you need that, see below.
>=20
> > +
> > +	/* Reclaim at the max 1/16th the total number of entries. */
> > +	max_cnt =3D dq->size >> RTE_RCU_QSBR_MAX_RECLAIM_LIMIT;
> > +	max_cnt =3D (max_cnt =3D=3D 0) ? dq->size : max_cnt;
>=20
> Again why not to make max_cnt a configurable at create() parameter?
I think making this as an optional parameter for creating defer queue is a =
better option.

> Or even a parameter for that function?
>=20
> > +	cnt =3D 0;
> > +
> > +	/* Check reader threads quiescent state and reclaim resources */
> > +	while ((cnt < max_cnt) && (rte_ring_peek(dq->r, &token) =3D=3D 0) &&
> > +		(rte_rcu_qsbr_check(dq->v, (uint64_t)((uintptr_t)token), false)
> > +			=3D=3D 1)) {
>=20
>=20
> > +		(void)rte_ring_sc_dequeue(dq->r, &token);
> > +		/* The resource to dequeue needs to be a multiple of 64b
> > +		 * due to the limitation of the rte_ring implementation.
> > +		 */
> > +		for (i =3D 0, tmp =3D (uint64_t *)dq->e; i < dq->esize/8;
> > +			i++, tmp++)
> > +			(void)rte_ring_sc_dequeue(dq->r,
> > +					(void *)(uintptr_t)tmp);
>=20
> Again, no need for such constructs with multiple dequeuer I believe.
> Just:
>=20
> const uint32_t nb_elt =3D  dq->elt_size/8 + 1; uint32_t n; uintptr_t
> elt[nb_elt]; ...
> n =3D rte_ring_dequeue_bulk(dq->r, elt, nb_elt, NULL); if (n !=3D 0) {dq-=
>f(dq->p,
> elt);}
Agree on bulk API use.

>=20
> Seems enough.
> Again in that case you can have enqueue/reclaim running in different thre=
ads
> simultaneously, plus you don't need dq->e at all.
Will check on dq->e

>=20
> > +		dq->f(dq->p, dq->e);
> > +
> > +		cnt++;
> > +	}
> > +
> > +	rte_log(RTE_LOG_INFO, rte_rcu_log_type,
> > +		"%s(): Reclaimed %u resources\n", __func__, cnt);
> > +
> > +	if (cnt =3D=3D 0) {
> > +		/* No resources were reclaimed */
> > +		rte_errno =3D EAGAIN;
> > +		return 1;
> > +	}
> > +
> > +	return 0;
>=20
> I'd suggest to return cnt on success.
I am trying to keep the APIs simple. I do not see much use for 'cnt' as ret=
urn value to the user. It exposes more details which I think are internal t=
o the library.

>=20
> > +}
> > +
> > +/* Delete a defer queue. */
> > +int
> > +rte_rcu_qsbr_dq_delete(struct rte_rcu_qsbr_dq *dq) {
> > +	if (dq =3D=3D NULL) {
> > +		rte_log(RTE_LOG_ERR, rte_rcu_log_type,
> > +			"%s(): Invalid input parameter\n", __func__);
> > +		rte_errno =3D EINVAL;
> > +
> > +		return 1;
> > +	}
> > +
> > +	/* Reclaim all the resources */
> > +	if (rte_rcu_qsbr_dq_reclaim(dq) !=3D 0)
> > +		/* Error number is already set by the reclaim API */
> > +		return 1;
>=20
> How do you know that you have reclaimed everything?
Good point, will come back with a different solution.

>=20
> > +
> > +	rte_ring_free(dq->r);
> > +	rte_free(dq);
> > +
> > +	return 0;
> > +}
> > +
> >  int rte_rcu_log_type;
> >
> >  RTE_INIT(rte_rcu_register)
> > diff --git a/lib/librte_rcu/rte_rcu_qsbr.h
> > b/lib/librte_rcu/rte_rcu_qsbr.h index c80f15c00..185d4b50a 100644
> > --- a/lib/librte_rcu/rte_rcu_qsbr.h
> > +++ b/lib/librte_rcu/rte_rcu_qsbr.h
> > @@ -34,6 +34,7 @@ extern "C" {
> >  #include <rte_lcore.h>
> >  #include <rte_debug.h>
> >  #include <rte_atomic.h>
> > +#include <rte_ring.h>
> >
> >  extern int rte_rcu_log_type;
> >
> > @@ -109,6 +110,67 @@ struct rte_rcu_qsbr {
> >  	 */
> >  } __rte_cache_aligned;
> >
> > +/**
> > + * Call back function called to free the resources.
> > + *
> > + * @param p
> > + *   Pointer provided while creating the defer queue
> > + * @param e
> > + *   Pointer to the resource data stored on the defer queue
> > + *
> > + * @return
> > + *   None
> > + */
> > +typedef void (*rte_rcu_qsbr_free_resource)(void *p, void *e);
>=20
> Stylish thing - usually in DPDK we have typedf newtype_t ...
> Though I am not sure you need a new typedef at all - just a function poin=
ter
> inside the struct seems enough.
Other libraries (for ex: rte_hash) use this approach. I think it is better =
to keep it out of the structure to allow for better commenting.

>=20
> > +
> > +#define RTE_RCU_QSBR_DQ_NAMESIZE RTE_RING_NAMESIZE
> > +
> > +/**
> > + *  Trigger automatic reclamation after 1/8th the defer queue is full.
> > + */
> > +#define RTE_RCU_QSBR_AUTO_RECLAIM_LIMIT 3
> > +
> > +/**
> > + *  Reclaim at the max 1/16th the total number of resources.
> > + */
> > +#define RTE_RCU_QSBR_MAX_RECLAIM_LIMIT 4
>=20
>=20
> As I said above, I don't think these thresholds need to be hardcoded.
> In any case, there seems not much point to put them in the public header =
file.
>=20
> > +
> > +/**
> > + * Parameters used when creating the defer queue.
> > + */
> > +struct rte_rcu_qsbr_dq_parameters {
> > +	const char *name;
> > +	/**< Name of the queue. */
> > +	uint32_t size;
> > +	/**< Number of entries in queue. Typically, this will be
> > +	 *   the same as the maximum number of entries supported in the
> > +	 *   lock free data structure.
> > +	 *   Data structures with unbounded number of entries is not
> > +	 *   supported currently.
> > +	 */
> > +	uint32_t esize;
> > +	/**< Size (in bytes) of each element in the defer queue.
> > +	 *   This has to be multiple of 8B as the rte_ring APIs
> > +	 *   support 8B element sizes only.
> > +	 */
> > +	rte_rcu_qsbr_free_resource f;
> > +	/**< Function to call to free the resource. */
> > +	void *p;
>=20
> Style nit again - I like short names myself, but that seems a bit extreme=
... :)
> Might be at least:
> void (*reclaim)(void *, void *);
May be 'free_fn'?

> void * reclaim_data;
> ?
This is the pointer to the data structure to free the resource into. For ex=
: In LPM data structure, it will be pointer to LPM. 'reclaim_data' does not=
 convey the meaning correctly.

>=20
> > +	/**< Pointer passed to the free function. Typically, this is the
> > +	 *   pointer to the data structure to which the resource to free
> > +	 *   belongs. This can be NULL.
> > +	 */
> > +	struct rte_rcu_qsbr *v;
>=20
> Does it need to be inside that struct?
> Might be better:
> rte_rcu_qsbr_dq_create(struct rte_rcu_qsbr *v, const struct
> rte_rcu_qsbr_dq_parameters *params);
The API takes a parameter structure as input anyway, why to add another arg=
ument to the function? QSBR variable is also another parameter.

>=20
> Another alternative: make both reclaim() and enqueue() to take v as a
> parameter.
But both of them need access to some of the parameters provided in rte_rcu_=
qsbr_dq_create API. We would end up passing 2 arguments to the functions.

>=20
> > +	/**< RCU QSBR variable to use for this defer queue */ };
> > +
> > +/* RTE defer queue structure.
> > + * This structure holds the defer queue. The defer queue is used to
> > + * hold the deleted entries from the data structure that are not
> > + * yet freed.
> > + */
> > +struct rte_rcu_qsbr_dq;
> > +
> >  /**
> >   * @warning
> >   * @b EXPERIMENTAL: this API may change without prior notice @@
> > -648,6 +710,113 @@ __rte_experimental  int  rte_rcu_qsbr_dump(FILE *f,
> > struct rte_rcu_qsbr *v);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Create a queue used to store the data structure elements that can
> > + * be freed later. This queue is referred to as 'defer queue'.
> > + *
> > + * @param params
> > + *   Parameters to create a defer queue.
> > + * @return
> > + *   On success - Valid pointer to defer queue
> > + *   On error - NULL
> > + *   Possible rte_errno codes are:
> > + *   - EINVAL - NULL parameters are passed
> > + *   - ENOMEM - Not enough memory
> > + */
> > +__rte_experimental
> > +struct rte_rcu_qsbr_dq *
> > +rte_rcu_qsbr_dq_create(const struct rte_rcu_qsbr_dq_parameters
> > +*params);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Enqueue one resource to the defer queue and start the grace period.
> > + * The resource will be freed later after at least one grace period
> > + * is over.
> > + *
> > + * If the defer queue is full, it will attempt to reclaim resources.
> > + * It will also reclaim resources at regular intervals to avoid
> > + * the defer queue from growing too big.
> > + *
> > + * This API is not multi-thread safe. It is expected that the caller
> > + * provides multi-thread safety by locking a mutex or some other means=
.
> > + *
> > + * A lock free multi-thread writer algorithm could achieve
> > +multi-thread
> > + * safety by creating and using one defer queue per thread.
> > + *
> > + * @param dq
> > + *   Defer queue to allocate an entry from.
> > + * @param e
> > + *   Pointer to resource data to copy to the defer queue. The size of
> > + *   the data to copy is equal to the element size provided when the
> > + *   defer queue was created.
> > + * @return
> > + *   On success - 0
> > + *   On error - 1 with rte_errno set to
> > + *   - EINVAL - NULL parameters are passed
> > + *   - ENOSPC - Defer queue is full. This condition can not happen
> > + *		if the defer queue size is equal (or larger) than the
> > + *		number of elements in the data structure.
> > + */
> > +__rte_experimental
> > +int
> > +rte_rcu_qsbr_dq_enqueue(struct rte_rcu_qsbr_dq *dq, void *e);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Reclaim resources from the defer queue.
> > + *
> > + * This API is not multi-thread safe. It is expected that the caller
> > + * provides multi-thread safety by locking a mutex or some other means=
.
> > + *
> > + * A lock free multi-thread writer algorithm could achieve
> > +multi-thread
> > + * safety by creating and using one defer queue per thread.
> > + *
> > + * @param dq
> > + *   Defer queue to reclaim an entry from.
> > + * @return
> > + *   On successful reclamation of at least 1 resource - 0
> > + *   On error - 1 with rte_errno set to
> > + *   - EINVAL - NULL parameters are passed
> > + *   - EAGAIN - None of the resources have completed at least 1 grace
> period,
> > + *		try again.
> > + */
> > +__rte_experimental
> > +int
> > +rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Delete a defer queue.
> > + *
> > + * It tries to reclaim all the resources on the defer queue.
> > + * If any of the resources have not completed the grace period
> > + * the reclamation stops and returns immediately. The rest of
> > + * the resources are not reclaimed and the defer queue is not
> > + * freed.
> > + *
> > + * @param dq
> > + *   Defer queue to delete.
> > + * @return
> > + *   On success - 0
> > + *   On error - 1
> > + *   Possible rte_errno codes are:
> > + *   - EINVAL - NULL parameters are passed
> > + *   - EAGAIN - Some of the resources have not completed at least 1 gr=
ace
> > + *		period, try again.
> > + */
> > +__rte_experimental
> > +int
> > +rte_rcu_qsbr_dq_delete(struct rte_rcu_qsbr_dq *dq);
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/librte_rcu/rte_rcu_qsbr_pvt.h
> > b/lib/librte_rcu/rte_rcu_qsbr_pvt.h
> > new file mode 100644
> > index 000000000..2122bc36a
> > --- /dev/null
> > +++ b/lib/librte_rcu/rte_rcu_qsbr_pvt.h
>=20
> Again style suggestion: as it is not public header - don't use rte_ prefi=
x for
> naming.
> From my perspective - easier to relalize for reader what is public header=
,
> what is not.
Looks like the guidelines are not defined very well. I see one private file=
 with rte_ prefix. I see Stephen not using rte_ prefix. I do not have any p=
reference. But, a consistent approach is required.

>=20
> > @@ -0,0 +1,46 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2019 Arm Limited
> > + */
> > +
> > +#ifndef _RTE_RCU_QSBR_PVT_H_
> > +#define _RTE_RCU_QSBR_PVT_H_
> > +
> > +/**
> > + * This file is private to the RCU library. It should not be included
> > + * by the user of this library.
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include "rte_rcu_qsbr.h"
> > +
> > +/* RTE defer queue structure.
> > + * This structure holds the defer queue. The defer queue is used to
> > + * hold the deleted entries from the data structure that are not
> > + * yet freed.
> > + */
> > +struct rte_rcu_qsbr_dq {
> > +	struct rte_rcu_qsbr *v; /**< RCU QSBR variable used by this queue.*/
> > +	struct rte_ring *r;     /**< RCU QSBR defer queue. */
> > +	uint32_t size;
> > +	/**< Number of elements in the defer queue */
> > +	uint32_t esize;
> > +	/**< Size (in bytes) of data stored on the defer queue */
> > +	rte_rcu_qsbr_free_resource f;
> > +	/**< Function to call to free the resource. */
> > +	void *p;
> > +	/**< Pointer passed to the free function. Typically, this is the
> > +	 *   pointer to the data structure to which the resource to free
> > +	 *   belongs.
> > +	 */
> > +	char e[0];
> > +	/**< Temporary storage to copy the defer queue element. */
>=20
> Do you really need 'e' at all?
> Can't it be just temporary stack variable?
Ok, will check.

>=20
> > +};
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_RCU_QSBR_PVT_H_ */
> > diff --git a/lib/librte_rcu/rte_rcu_version.map
> > b/lib/librte_rcu/rte_rcu_version.map
> > index f8b9ef2ab..dfac88a37 100644
> > --- a/lib/librte_rcu/rte_rcu_version.map
> > +++ b/lib/librte_rcu/rte_rcu_version.map
> > @@ -8,6 +8,10 @@ EXPERIMENTAL {
> >  	rte_rcu_qsbr_synchronize;
> >  	rte_rcu_qsbr_thread_register;
> >  	rte_rcu_qsbr_thread_unregister;
> > +	rte_rcu_qsbr_dq_create;
> > +	rte_rcu_qsbr_dq_enqueue;
> > +	rte_rcu_qsbr_dq_reclaim;
> > +	rte_rcu_qsbr_dq_delete;
> >
> >  	local: *;
> >  };
> > diff --git a/lib/meson.build b/lib/meson.build index
> > e5ff83893..0e1be8407 100644
> > --- a/lib/meson.build
> > +++ b/lib/meson.build
> > @@ -11,7 +11,9 @@
> >  libraries =3D [
> >  	'kvargs', # eal depends on kvargs
> >  	'eal', # everything depends on eal
> > -	'ring', 'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core
> > +	'ring',
> > +	'rcu', # rcu depends on ring
> > +	'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core
> >  	'cmdline',
> >  	'metrics', # bitrate/latency stats depends on this
> >  	'hash',    # efd depends on this
> > @@ -22,7 +24,7 @@ libraries =3D [
> >  	'gro', 'gso', 'ip_frag', 'jobstats',
> >  	'kni', 'latencystats', 'lpm', 'member',
> >  	'power', 'pdump', 'rawdev',
> > -	'rcu', 'reorder', 'sched', 'security', 'stack', 'vhost',
> > +	'reorder', 'sched', 'security', 'stack', 'vhost',
> >  	# ipsec lib depends on net, crypto and security
> >  	'ipsec',
> >  	# add pkt framework libs which use other libs from above
> > --
> > 2.17.1