From: Honnappa Nagarahalli
To: Stephen Hemminger
CC: "Van Haaren, Harry", "dev@dpdk.org", nd, Dharmik Thakkar, Malvika Gupta, "Gavin Hu (Arm Technology China)", Honnappa Nagarahalli, nd
Date: Fri, 30 Nov 2018 02:13:32 +0000
References: <20181122033055.3431-1-honnappa.nagarahalli@arm.com> <20181127142803.423c9b00@xeon-e3> <20181128152351.27fdebe3@xeon-e3>
In-Reply-To: <20181128152351.27fdebe3@xeon-e3>
Subject: Re: [dpdk-dev] [RFC 0/3] tqs: add thread quiescent state library
List-Id: DPDK patches and discussions

>
> > > > Mixed feelings about this one.
> > > >
> > > > Love to see RCU used for more things since it is much better than
> > > > reader/writer locks for many applications. But hate to see DPDK
> > > > reinventing every other library and not reusing code. Userspace
> > > > RCU https://liburcu.org/ is widely supported by distros, more
> > > > thoroughly tested and documented, and more flexible.
> > > >
> > > > The issue with many of these reinventions is a tradeoff of DPDK
> > > > growing another dependency on external library versus using common code.
> > > >
> > Agree with the dependency issues. Sometimes flexibility also causes confusion
> > and features that are not necessarily required for a targeted use case. I have
> > seen that much of the functionality that can be left to the application is
> > implemented as part of the library.
> > I think having it in DPDK will give us control over the amount of capability this
> > library will have and freedom over changes we would like to make to such a
> > library. I also view DPDK as one package where all things required for data
> > plane development are available.
> >
> > > > For RCU, the big issue for me is the testing and documentation of
> > > > how to do RCU safely. Many people get it wrong!
> > Hopefully, we all will do a better job collectively :)
> >
> >
>
> Reinventing RCU is not helping anyone.
IMO, this depends on what rte_tqs has to offer and what the requirements are. Before starting this patch, I looked at the liburcu APIs. I have to say, I concluded fairly quickly (no offense) that they do not address DPDK's needs. I took a deeper look at the APIs/code in the past day and still came to the same conclusion. My partial analysis is as follows (more APIs could be analyzed, but I do not have the cycles at this point):

The reader threads' information is maintained in a linked list [1]. This linked list is protected by a mutex lock [2].
Any additions/deletions/traversals of this list are blocking and cannot happen in parallel.

The API 'synchronize_rcu' [3] (similar in functionality to the rte_tqs_check call) is a blocking call. There is no option to make it non-blocking. The writer spins while waiting for the grace period to end.

'synchronize_rcu' also takes a grace period lock [4]. If I have multiple writers running on data plane threads, I cannot call this API to reclaim memory in the worker threads, as it will block other worker threads. This means an extra thread is required (on the control plane?) to do garbage collection, along with a method to push the pointers from the worker threads to the garbage collection thread. It also means the time from delete to free increases, putting pressure on the amount of memory held up.
Since this API cannot be called concurrently by multiple writers, each writer has to wait for the other writers' grace periods to end (i.e. multiple writer threads cannot overlap their grace periods).
This API also has to traverse the linked list, which is not well suited to being called on the data plane.

I have not gone too deep into the rcu_thread_offline [5] API. This again needs to be used on worker cores and does not look very optimal.

I have glanced at rcu_quiescent_state [6]; it wakes up the thread calling 'synchronize_rcu', which seems like a good amount of code for the data plane.

[1] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/static/urcu-qsbr.h#L85
[2] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c#L68
[3] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c#L344
[4] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c#L58
[5] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/static/urcu-qsbr.h#L211
[6] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/static/urcu-qsbr.h#L193

Coming to what is provided in rte_tqs:

The synchronize_rcu functionality is split into 2 APIs: rte_tqs_start and rte_tqs_check. The reader data is maintained as an array.

Both APIs are lock-free, allowing them to be called from multiple threads concurrently. This allows multiple writers to wait for their grace periods concurrently as well as overlap their grace periods. The rte_tqs_start API returns a token, which makes it possible to separate the quiescent state waiting of different writers. Hence, no writer waits for another writer's grace period to end.

Since these 2 APIs are lock-free, they can also be called from writers running on worker cores, without the need for a separate thread to do garbage collection.

The separation into 2 APIs means writers do not have to spin waiting for the grace period to end. This enables different ways of doing garbage collection. For example, a data structure delete API could remove the entry from the data structure, call rte_tqs_start and return to the caller. On the next invocation of one of the library's APIs, that API can call rte_tqs_check (which will mostly indicate that the grace period is complete) and free the previously deleted entry.

rte_tqs_update (mapping to rcu_quiescent_state) is pretty small and simple.

The rte_tqs_register and rte_tqs_unregister APIs are lock-free. Hence additional APIs like rcu_thread_online and rcu_thread_offline are not required.
The rte_tqs_unregister API (when compared to rcu_thread_offline) is much simpler and conducive to use in worker threads.

>
>
> DPDK needs to fix its dependency model, and just admit that it is ok to build off
> of more than glibc.
>
> Having used liburcu, it can be done in a small manner and really isn't that
> confusing.
>
> Is your real issue the LGPL license of liburcu for your skittish customers?
I have not had any discussions on this. Customers are mainly focused on having a solution over which they have meaningful control. They want to be able to submit a patch and change things if required. For example, the barriers for Arm [7] are not optimal. How easy is it to change this and get it into distros (there are both internal and external factors here)?

[7] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/arch/arm.h#L44
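
To make the delete-then-reclaim flow described earlier more concrete (remove the entry, start a grace period, return, and free on a later call once the check passes), here is a minimal sketch. It is not the rte_tqs code from the RFC: the qs_* and table_* names, the token counter and the per-reader array layout are all assumptions made for illustration. It keeps a single pending entry owned by one writer and uses sequentially consistent C11 atomics rather than tuned barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define QS_MAX_READERS 8

/* Hypothetical quiescent-state variable; NOT the rte_tqs layout from the RFC. */
struct qs_var {
    _Atomic uint64_t token;                 /* last token handed to a writer */
    _Atomic uint64_t seen[QS_MAX_READERS];  /* last token each reader reported */
    _Atomic bool active[QS_MAX_READERS];    /* is this reader slot registered? */
};

static void qs_init(struct qs_var *v)
{
    atomic_init(&v->token, 0);
    for (int i = 0; i < QS_MAX_READERS; i++) {
        atomic_init(&v->seen[i], 0);
        atomic_init(&v->active[i], false);
    }
}

/* Reader: register, starting from the current token. */
static void qs_register(struct qs_var *v, int id)
{
    assert(id >= 0 && id < QS_MAX_READERS);
    atomic_store(&v->seen[id], atomic_load(&v->token));
    atomic_store(&v->active[id], true);
}

static void qs_unregister(struct qs_var *v, int id)
{
    atomic_store(&v->active[id], false);
}

/* Writer: open a grace period; the returned token identifies it. */
static uint64_t qs_start(struct qs_var *v)
{
    return atomic_fetch_add(&v->token, 1) + 1;
}

/* Reader: report a quiescent state (no references to shared data held). */
static void qs_update(struct qs_var *v, int id)
{
    atomic_store(&v->seen[id], atomic_load(&v->token));
}

/* Writer: non-blocking poll. True once every registered reader has
 * reported a quiescent state at or after the given token. */
static bool qs_check(struct qs_var *v, uint64_t token)
{
    for (int i = 0; i < QS_MAX_READERS; i++)
        if (atomic_load(&v->active[i]) && atomic_load(&v->seen[i]) < token)
            return false;
    return true;
}

struct entry { int key; };

/* Single-slot "data structure" with one deferred free (single writer). */
struct table {
    struct qs_var *qs;
    _Atomic(struct entry *) slot;
    struct entry *pending;      /* removed from slot, not yet freed */
    uint64_t pending_token;     /* grace period guarding 'pending' */
    int freed;                  /* number of completed frees */
};

static void table_init(struct table *t, struct qs_var *qs)
{
    t->qs = qs;
    atomic_init(&t->slot, (struct entry *)NULL);
    t->pending = NULL;
    t->pending_token = 0;
    t->freed = 0;
}

/* Free the previously deleted entry if its grace period is over. */
static void table_reclaim(struct table *t)
{
    if (t->pending != NULL && qs_check(t->qs, t->pending_token)) {
        free(t->pending);
        t->pending = NULL;
        t->freed++;
    }
}

static bool table_insert(struct table *t, int key)
{
    table_reclaim(t);           /* piggyback reclamation on every API call */
    if (atomic_load(&t->slot) != NULL)
        return false;
    struct entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return false;
    e->key = key;
    atomic_store(&t->slot, e);
    return true;
}

/* Delete: unlink, start a grace period, return without waiting. */
static bool table_delete(struct table *t)
{
    table_reclaim(t);
    if (t->pending != NULL)
        return false;           /* sketch tracks one pending entry only */
    struct entry *e = atomic_exchange(&t->slot, (struct entry *)NULL);
    if (e == NULL)
        return false;
    t->pending = e;
    t->pending_token = qs_start(t->qs);
    return true;
}
```

In this sketch a writer calls table_delete and returns immediately; the entry is only freed by a later table_reclaim (or the next insert/delete) after every registered reader has called qs_update. Because each qs_start call returns its own token, two grace periods opened back to back can be polled independently, which is the overlap property argued for above.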