From: Honnappa Nagarahalli
To: Neil Horman
CC: "konstantin.ananyev@intel.com", "stephen@networkplumber.org", "paulmck@linux.ibm.com", "marko.kovacevic@intel.com", "dev@dpdk.org", "Gavin Hu (Arm Technology China)", Dharmik Thakkar, Malvika Gupta, Honnappa Nagarahalli, nd, nd
Date: Wed, 1 May 2019 21:18:05 +0000
Subject: Re: [dpdk-dev] [PATCH v9 0/4] lib/rcu: add RCU library supporting QSBR mechanism
Message-ID: <20190501211805.a9tbaPBFr-OBE7ciJG47fe3pfRPFVqZ9ynL_1ab0y6w@z>
In-Reply-To: <20190501180524.GC26521@hmswarspite.think-freely.org>
References: <20181122033055.3431-1-honnappa.nagarahalli@arm.com>
 <20190501035419.33524-1-honnappa.nagarahalli@arm.com>
 <20190501121545.GA26521@hmswarspite.think-freely.org>
 <20190501180524.GC26521@hmswarspite.think-freely.org>
List-Id: DPDK patches and discussions

> Subject: Re: [dpdk-dev] [PATCH v9 0/4] lib/rcu: add RCU library supporting
> QSBR mechanism
>
> On Wed, May 01, 2019 at 02:56:48PM +0000, Honnappa Nagarahalli wrote:
> > >
> > > On Tue, Apr 30, 2019 at 10:54:15PM -0500, Honnappa Nagarahalli wrote:
> > > > Lock-less data structures provide scalability and determinism.
> > > > They enable use cases where locking may not be allowed (for ex:
> > > > real-time applications).
> > > >
> > > I know this is version 9 of the patch, so I'm sorry for the late
> > > comment, but I have to ask: Why re-invent this wheel?  There are
> > > already several Userspace
> > Thanks Neil, for asking the question.
> > This has been debated before. Please refer to [2] for more details.
> >
> > liburcu [1] was explored as it seemed to be familiar to others in the
> > community. I am not aware of any other library.
> >
> > There are unique requirements in DPDK and there is still scope for
> > improvement from what is available. I have explained this in the cover
> > letter without making a direct comparison to liburcu. May be it is worth
> > tweaking the documentation to call this out explicitly.
> >
> I think what you're referring to here is the need for multiple QSBR variables,
> yes?  I'm not sure thats, strictly speaking, a requirement.  It seems like its a
> performance improvement, but I'm not sure thats the case (see performance
> numbers below).
DPDK supports the service cores feature and pipeline mode, where a
particular data structure is used by only a subset of readers. These use
cases affect the writer and the readers (which are on the data plane) in
the following ways:

1) The writer does not need to wait for all the readers to pass through a
   quiescent state. The writer does not need to spend CPU cycles and add to
   memory bandwidth polling readers it does not care about. DPDK has use
   cases where the writer is on the data plane as well.

2) The readers that do not use the data structure do not have to spend
   cycles reporting their quiescent state. Note that these are data plane
   cycles.

Other than this, please read in the cover letter how the grace period and
the critical section affect the overhead introduced by the QSBR mechanism.
The cover letter also explains how this library solves this issue.

This is discussed in the discussion thread I provided earlier.

>
> Regarding performance, we can't keep using raw performance as a trump card
IMO, performance is NOT a 'trump card'. The whole essence of DPDK is
performance. If not for performance, would DPDK exist?

> for all other aspects of the DPDK.
> This entire patch is meant to improve
> performance, it seems like it would be worthwhile to gain the code
> consolidation and reuse benefits for the minor performance hit.
Apologies, I did not understand this. Can you please elaborate on the code
consolidation part?

>
> Further to performance, I may be misreading this, but I ran the integrated
> performance test you provided in this patch, as well as the benchmark tests for
> liburcw (trimmed for easier reading here)
Just to be sure, I believe you are referring to *liburcu*.

>
> liburcw:
> [nhorman@hmswarspite benchmark]$ ./test_urcu 7 1 1 -v -a 0 -a 1 -a 2 -a 3 -a 4 -a 5 -a 6 -a 7 -a 0
> Adding CPU 0 affinity
> Adding CPU 1 affinity
> Adding CPU 2 affinity
> Adding CPU 3 affinity
> Adding CPU 4 affinity
> Adding CPU 5 affinity
> Adding CPU 6 affinity
> Adding CPU 7 affinity
> Adding CPU 0 affinity
> running test for 1 seconds, 7 readers, 1 writers.
> Writer delay : 0 loops.
> Reader duration : 0 loops.
> thread main , tid 22712
> thread_begin reader, tid 22726
> thread_begin reader, tid 22729
> thread_begin reader, tid 22728
> thread_begin reader, tid 22727
> thread_begin reader, tid 22731
> thread_begin reader, tid 22730
> thread_begin reader, tid 22732
> thread_begin writer, tid 22733
> thread_end reader, tid 22729
> thread_end reader, tid 22731
> thread_end reader, tid 22730
> thread_end reader, tid 22728
> thread_end reader, tid 22727
> thread_end writer, tid 22733
> thread_end reader, tid 22726
> thread_end reader, tid 22732
> total number of reads : 1185640774, writes 264444
> SUMMARY /home/nhorman/git/userspace-rcu/tests/benchmark/.libs/lt-test_urcu testdur 1 nr_readers 7 rdur 0 wdur 0 nr_writers 1 wdelay 0 nr_reads 1185640774 nr_writes 264444 nr_ops 1185905218
>
> DPDK:
> Perf test: 1 writer, 7 readers, 1 QSBR variable, 1 QSBR Query, Non-Blocking QSBR check
> Following numbers include calls to rte_hash functions
> Cycles per 1 update(online/update/offline): 813407
> Cycles per 1 check(start, check): 859679
>
>
> Both of these tests qsbr rcu in each library using 7 readers and 1 writer.  Its a
> little bit of an apples to oranges comparison, as the tests run using slightly
Thanks for running the test. Yes, it is an apples-to-oranges comparison:
1) The test you are running is not the correct test, assuming the code for
   this test is [3].
2) This is not QSBR.

I suggest you use [4] for your testing. It also needs further changes to
match the test case in this patch. The function 'thr_reader' reports
quiescent state every 1024 iterations; please change it to report every
iteration.

After this, you need to compare these results with the first test case in
this patch.

[3] https://github.com/urcu/userspace-rcu/blob/master/tests/benchmark/test_urcu.c
[4] https://github.com/urcu/userspace-rcu/blob/master/tests/benchmark/test_urcu_qsbr.c

> different parameters, and produce different output statistics, but I think they
> can be somewhat normalized.  Primarily the stat that stuck out to me was the
> DPDK Cycles per 1 update statistic, which I believe is effectively the number of
> cycles spent in the test / the number of writer updates.  On DPDK that number
> in this test run works out to 813407.  In the liburcw test, it reports the total
> number of ops (cycles), and the number of writes completed within those
> cycles.
> If we do the same division there we get 185905218 / 264444 = 4484.  I may be
> misreading something here, but that seems like a pretty significant write side
Yes, you are misreading. 'number of ops' is not cycles. It is the sum of
'nr_writes' and 'nr_reads'. The test runs for 1 sec (uses 'sleep'), so
these are the number of operations done in 1 sec. You need to normalize to
number of cycles using this data.

> performance improvement over this implementation.
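
To make the normalization concrete, here is a minimal sketch of the
arithmetic. It uses the counters from the liburcu SUMMARY line quoted above
and the 1 second test duration. The CPU frequency is NOT stated anywhere in
this thread, so ASSUMED_CPU_HZ is a purely hypothetical placeholder that
must be replaced with the measured frequency of the test machine. The
helper name cycles_per_operation is also made up for illustration, and the
estimate assumes the single writer thread ran for the whole test duration.

```python
# Counters taken from the quoted liburcu SUMMARY line.
NR_READS = 1185640774
NR_WRITES = 264444
NR_OPS = 1185905218
TEST_DURATION_S = 1

# ASSUMPTION: hypothetical frequency, not given in the thread. Replace it
# with the real (constant) core frequency of the machine under test.
ASSUMED_CPU_HZ = 2_000_000_000


def cycles_per_operation(cpu_hz, duration_s, operations):
    """Average cycles one thread spent per operation over the test run."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    return cpu_hz * duration_s / operations


# nr_ops is an operation count (reads + writes), not a cycle count.
print("nr_reads + nr_writes == nr_ops:", NR_READS + NR_WRITES == NR_OPS)

# Dividing nr_ops by nr_writes only gives operations per write.
print("operations per write:", NR_OPS / NR_WRITES)

# Cycles per write for the single writer, under the assumed frequency.
print("estimated cycles per write:",
      cycles_per_operation(ASSUMED_CPU_HZ, TEST_DURATION_S, NR_WRITES))
```

Even after this conversion, the result is only comparable to the DPDK
"Cycles per 1 update" figure once the QSBR test [4] is used and the reader
reports its quiescent state on every iteration, as described above.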
>
> Neil
>
> > [1] https://liburcu.org/
> > [2] http://mails.dpdk.org/archives/dev/2018-November/119875.html
> >
> > > RCU libraries that are mature and carried by Linux and BSD distributions.
> > > Why would we throw another one into DPDK instead of just using whats
> > > already available, mature and stable?
> > >
> > > Neil
> > >
>
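
As an illustration of points 1) and 2) earlier in this reply, the following
toy model shows why giving each data structure its own QSBR variable
reduces the work for both the writer and the readers. This is a simplified
sketch, not the API of this library or of liburcu: the class QsbrVariable,
its methods and the helper run_grace_period are hypothetical names, and
the model is single-threaded with no atomics or memory ordering.

```python
class QsbrVariable:
    """Toy quiescent-state tracker for one group of reader threads."""

    def __init__(self):
        self.token = 0           # latest grace-period token issued by a writer
        self.reader_tokens = {}  # reader id -> last token the reader acknowledged
        self.polls = 0           # reader counters inspected by writers so far

    def register(self, reader_id):
        # A newly registered reader holds no references to older data.
        self.reader_tokens[reader_id] = self.token

    def quiescent(self, reader_id):
        # The reader reports that it holds no references to shared data.
        self.reader_tokens[reader_id] = self.token

    def start(self):
        # The writer opens a new grace period.
        self.token += 1
        return self.token

    def check(self, token):
        # The writer polls only the readers registered on THIS variable.
        done = True
        for seen in self.reader_tokens.values():
            self.polls += 1
            if seen < token:
                done = False
        return done


def run_grace_period(variable):
    """Return (reader reports, counters polled) for one grace period."""
    polls_before = variable.polls
    token = variable.start()
    reports = 0
    for reader_id in list(variable.reader_tokens):
        variable.quiescent(reader_id)
        reports += 1
    assert variable.check(token)
    return reports, variable.polls - polls_before


# One variable shared by all 8 readers, although only 2 use the table.
shared = QsbrVariable()
for reader in range(8):
    shared.register(reader)

# A dedicated variable for the table, with only its 2 readers registered.
per_table = QsbrVariable()
for reader in (0, 1):
    per_table.register(reader)

print(run_grace_period(shared))     # (8, 8)
print(run_grace_period(per_table))  # (2, 2)
```

With the shared variable, all 8 readers spend cycles reporting and the
writer polls 8 counters. With the dedicated variable, only the 2 readers
that actually use the table report, and the writer polls 2 counters.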