Date: Fri, 30 Nov 2018 15:44:17 -0800
From: Stephen Hemminger
To: Mattias Rönnblom
Cc: Honnappa Nagarahalli, "Van Haaren, Harry", dev@dpdk.org, nd,
 Dharmik Thakkar, Malvika Gupta, "Gavin Hu (Arm Technology China)"
Subject: Re: [dpdk-dev] [RFC 0/3] tqs: add thread quiescent state library
Message-ID: <20181130154417.7cf7349b@xeon-e3>
In-Reply-To: <5c5db46f-e154-7932-0905-031e153c6016@ericsson.com>
References: <20181122033055.3431-1-honnappa.nagarahalli@arm.com>
 <20181127142803.423c9b00@xeon-e3>
 <20181128152351.27fdebe3@xeon-e3>
 <5c5db46f-e154-7932-0905-031e153c6016@ericsson.com>

On Fri, 30 Nov 2018 21:56:30 +0100
Mattias Rönnblom wrote:

> On 2018-11-30 03:13, Honnappa Nagarahalli wrote:
> >>
> >> Reinventing RCU is not helping anyone.
> > IMO, this depends on what rte_tqs has to offer and what the
> > requirements are. Before starting this patch, I looked at the
> > liburcu APIs. I have to say, fairly quickly (no offense), I
> > concluded that they do not address DPDK's needs.
> > I took a deeper look at the APIs/code in the past day and I still
> > concluded the same. My partial analysis (analysis of more APIs can
> > be done; I do not have cycles at this point) is as follows:
> >
> > The reader threads' information is maintained in a linked list [1].
> > This linked list is protected by a mutex lock [2]. Any additions,
> > deletions, or traversals of this list are blocking and cannot
> > happen in parallel.
> >
> > The 'synchronize_rcu' API [3] (similar in functionality to the
> > rte_tqs_check call) is a blocking call, and there is no option to
> > make it non-blocking. The writer spins while waiting for the grace
> > period to end.
>
> Wouldn't the options be call_rcu, which rarely blocks, or
> defer_rcu(), which never does? Why would the average application
> want to wait for the grace period to be over anyway?
>
> > 'synchronize_rcu' also takes a grace-period lock [4]. If I have
> > multiple writers running on data plane threads, I cannot call this
> > API to reclaim memory in the worker threads, as it will block the
> > other worker threads. This means an extra thread is required (on
> > the control plane?) to do garbage collection, plus a method to push
> > pointers from the worker threads to that garbage-collection thread.
> > It also means the time from delete to free increases, putting
> > pressure on the amount of memory held up.
> > Since this API cannot be called concurrently by multiple writers,
> > each writer has to wait for the other writers' grace periods to end
> > (i.e. multiple writer threads cannot overlap their grace periods).
>
> "Real" DPDK applications typically have to interact with the outside
> world using interfaces beyond DPDK packet I/O, and this is best done
> via an intermediate "control plane" thread running in the DPDK
> application. Typically, this thread would also be the RCU writer and
> "garbage collector", I would say.
>
> > This API also has to traverse the linked list, which is not well
> > suited to being called on the data plane.
> >
> > I have not gone too deeply into the rcu_thread_offline [5] API. It
> > again needs to be used on worker cores and does not look very
> > optimal.
> >
> > I have glanced at rcu_quiescent_state [6]; it wakes up the thread
> > calling 'synchronize_rcu', which seems like a good amount of code
> > for the data plane.
>
> Wouldn't the typical DPDK lcore worker call rcu_quiescent_state()
> after processing a burst of packets? If so, I would lean more toward
> "negligible overhead" than "a good amount of code".
>
> I must admit I didn't look at your library in detail, but I must
> still ask: if TQS is basically RCU, why isn't it called RCU? And why
> aren't the API calls named in a similar manner?

We used liburcu at Brocade with DPDK. It was just a case of putting
rcu_quiescent_state() in the packet handling loop; a rough sketch is
below.

There were a bunch more cases where the control threads needed to
register/unregister as part of RCU. I think any library would have
that issue with user-supplied threads: you need a "worry about me" and
a "don't worry about me" API in the library.

There is also a tradeoff between call_rcu and defer_rcu as to what
context the RCU callback runs in; you really need a control thread to
handle the RCU cleanup.

The point is that RCU steps into the application design, and liburcu
seems to be flexible enough and well documented enough to allow for
more options.
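For concreteness, the reader-side pattern looked roughly like this,
using the liburcu QSBR flavor. This is a sketch, not verbatim Brocade
code; handle_packet() and the 'running' flag are placeholders for
application logic:

#include <stdbool.h>
#include <stdint.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <urcu-qsbr.h>

#define BURST_SIZE 32

static volatile bool running = true;

/* Placeholder for real application processing. */
void handle_packet(struct rte_mbuf *m);

static int
lcore_main(void *arg)
{
	uint16_t port = *(const uint16_t *)arg;
	struct rte_mbuf *bufs[BURST_SIZE];

	rcu_register_thread();		/* the "worry about me" call */

	while (running) {
		uint16_t nb = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
		uint16_t i;

		for (i = 0; i < nb; i++)
			handle_packet(bufs[i]);	/* may read RCU-protected data */

		/*
		 * No RCU-protected pointers are held across bursts, so
		 * report a quiescent state. In the QSBR flavor this is
		 * a few instructions, not a system call.
		 */
		rcu_quiescent_state();
	}

	rcu_unregister_thread();	/* the "don't worry about me" call */
	return 0;
}

If an lcore is about to sleep or block for a while, wrapping the idle
period in rcu_thread_offline()/rcu_thread_online() keeps writers from
waiting on it.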
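On the writer side, the way to avoid blocking in synchronize_rcu() on
a fast-path core is to hand the old object to call_rcu() and let the
reclaim run later on liburcu's call_rcu helper thread, i.e. the
"garbage collector" role discussed above. Again a sketch; struct
fib_entry is a made-up example type:

#include <stdint.h>
#include <stdlib.h>

#include <urcu-qsbr.h>
#include <urcu/compiler.h>	/* caa_container_of() */

/* Made-up example type; only the embedded rcu_head matters here. */
struct fib_entry {
	uint32_t next_hop;
	struct rcu_head rcu;
};

static void
free_fib_entry(struct rcu_head *head)
{
	free(caa_container_of(head, struct fib_entry, rcu));
}

/*
 * Publish a replacement entry. Readers still holding the old pointer
 * keep using it until they pass a quiescent state; free_fib_entry()
 * then runs on the call_rcu helper thread, not on the fast path.
 */
static void
fib_replace(struct fib_entry **slot, struct fib_entry *new_entry)
{
	struct fib_entry *old = *slot;

	rcu_assign_pointer(*slot, new_entry);
	if (old != NULL)
		call_rcu(&old->rcu, free_fib_entry);	/* does not block */
}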