From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-f196.google.com (mail-pf1-f196.google.com [209.85.210.196]) by dpdk.org (Postfix) with ESMTP id CC43D1B587 for ; Fri, 30 Nov 2018 19:32:10 +0100 (CET) Received: by mail-pf1-f196.google.com with SMTP id i12so3208209pfo.7 for ; Fri, 30 Nov 2018 10:32:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Q1FBnZ8iH8fZXIyy96/5nuQ4C/EA0F84enhk4fgTOOc=; b=IzTB/dFH1mYIpF4fetVBXGa8quF+PRwpEFxFSF3VDFmYpsZB54PXaMDpgD+/ZgcwpC seBnT3ytInbh7cE8emRyN/0j8K3mLO27uf7JAe5JFweAaKxFg+9qPa6jwwK7ZfesNZcd 9QAtRS9e+79XfCts8fapfjBTiR06F0QjIE3eSHn+384YMp27ItD1szRihlz+O/wdqdQV af7JT+wLtNMmla8aARJP9sKD76SDSIXjF70Wa20Tq1RmkMUJaE5mtzFr+ssTPltWfB/l 3y21A14ubvmJvLETEdF2HphMpT0XDWsEKvGRHGu8USYXR/QpKD6wlyvZhfBYCg4HPjaA tXCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Q1FBnZ8iH8fZXIyy96/5nuQ4C/EA0F84enhk4fgTOOc=; b=RUl25YBuOPYK3sSvAC8Hbps51uiQ+mqn0j+1hzPgzZbyMSyB8iRYLI20BojfpFrT91 gk8z722su4L82z707MFQAt3IHkYkqi5kKHvgWBSh7aEU8b3EruQ57zCZsWcXtSW2rbmZ wTfDGhSHYbbqXT7FLOCRZiTCpY7tLrbqyzJkF/Wzkx+WXwSoyDY4BoAUgyk+CCX/wTbz 5CnKj2YMa1reJ16a0rlNBm4A+oUzbeAuM02TAU4fvh/jLmg7hvR+ipxrUxo47IlKqR2y jwvJqK6KgsHecI3rGk3PUBpU7MrIKOZyk0zZfhNf4ZM+2ZxYli2s/0BUSk9BdEOrlo45 LJTQ== X-Gm-Message-State: AA+aEWZSrkoKM0+puzxgE9c2CCGdFXIppWT3xwIbjrvlGo5B93YNfG8h Z+e8QajRfrJqkF0m7GO6j63o0g== X-Google-Smtp-Source: AFSGD/UorOlY4ngGWIyGLQS6h1ZeOmx/127hIJVOeXy0zW8mL+LPFiLGbBNXtyx0oPp0732mLkPuiw== X-Received: by 2002:a62:4641:: with SMTP id t62mr6564618pfa.141.1543602729818; Fri, 30 Nov 2018 10:32:09 -0800 (PST) Received: from xeon-e3 (204-195-22-127.wavecable.com. [204.195.22.127]) by smtp.gmail.com with ESMTPSA id g5sm11742189pgu.57.2018.11.30.10.32.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 30 Nov 2018 10:32:09 -0800 (PST) Date: Fri, 30 Nov 2018 10:32:07 -0800 From: Stephen Hemminger To: Luca Boccassi Cc: Honnappa Nagarahalli , "Van Haaren, Harry" , "dev@dpdk.org" , nd , Dharmik Thakkar , Malvika Gupta , "Gavin Hu (Arm Technology China)" Message-ID: <20181130103207.4f4f008d@xeon-e3> In-Reply-To: <1543595185.18215.5.camel@debian.org> References: <20181122033055.3431-1-honnappa.nagarahalli@arm.com> <20181127142803.423c9b00@xeon-e3> <20181128152351.27fdebe3@xeon-e3> <1543595185.18215.5.camel@debian.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [RFC 0/3] tqs: add thread quiescent state library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Nov 2018 18:32:11 -0000 On Fri, 30 Nov 2018 16:26:25 +0000 Luca Boccassi wrote: > On Fri, 2018-11-30 at 02:13 +0000, Honnappa Nagarahalli wrote: > > > =20 > > > > > > Mixed feelings about this one. > > > > > >=20 > > > > > > Love to see RCU used for more things since it is much better > > > > > > than > > > > > > reader/writer locks for many applications. But hate to see > > > > > > DPDK > > > > > > reinventing every other library and not reusing code. > > > > > > Userspace > > > > > > RCU https://liburcu.org/ is widely supported by distro's, > > > > > > more > > > > > > throughly tested and documented, and more flexiple. > > > > > >=20 > > > > > > The issue with many of these reinventions is a tradeoff of > > > > > > DPDK > > > > > > growing another dependency on external library versus using > > > > > > common =20 > > >=20 > > > code. =20 > > > > > > =20 > > > >=20 > > > > Agree with the dependency issues. Sometimes flexibility also > > > > causes confusion =20 > > >=20 > > > and features that are not necessarily required for a targeted use > > > case. I have > > > seen that much of the functionality that can be left to the > > > application is > > > implemented as part of the library. =20 > > > > I think having it in DPDK will give us control over the amount of > > > > capability this =20 > > >=20 > > > library will have and freedom over changes we would like to make to > > > such a > > > library. I also view DPDK as one package where all things required > > > for data > > > plane development are available. =20 > > > > =20 > > > > > > For RCU, the big issue for me is the testing and > > > > > > documentation of > > > > > > how to do RCU safely. Many people get it wrong! =20 > > > >=20 > > > > Hopefully, we all will do a better job collectively :) > > > > =20 > > > > > =20 > > >=20 > > > Reinventing RCU is not helping anyone. =20 > >=20 > > IMO, this depends on what the rte_tqs has to offer and what the > > requirements are. Before starting this patch, I looked at the liburcu > > APIs. I have to say, fairly quickly (no offense) I concluded that > > this does not address DPDK's needs. I took a deeper look at the > > APIs/code in the past day and I still concluded the same. My partial > > analysis (analysis of more APIs can be done, I do not have cycles at > > this point) is as follows: > >=20 > > The reader threads' information is maintained in a linked list[1]. > > This linked list is protected by a mutex lock[2]. Any > > additions/deletions/traversals of this list are blocking and cannot > > happen in parallel. > >=20 > > The API, 'synchronize_rcu' [3] (similar functionality to > > rte_tqs_check call) is a blocking call. There is no option provided > > to make it non-blocking. The writer spins cycles while waiting for > > the grace period to get over. > >=20 > > 'synchronize_rcu' also has grace period lock [4]. If I have multiple > > writers running on data plane threads, I cannot call this API to > > reclaim the memory in the worker threads as it will block other > > worker threads. This means, there is an extra thread required (on the > > control plane?) which does garbage collection and a method to push > > the pointers from worker threads to the garbage collection thread. > > This also means the time duration from delete to free increases > > putting pressure on amount of memory held up. > > Since this API cannot be called concurrently by multiple writers, > > each writer has to wait for other writer's grace period to get over > > (i.e. multiple writer threads cannot overlap their grace periods). > >=20 > > This API also has to traverse the linked list which is not very well > > suited for calling on data plane. > >=20 > > I have not gone too much into rcu_thread_offline[5] API. This again > > needs to be used in worker cores and does not look to be very > > optimal. > >=20 > > I have glanced at rcu_quiescent_state [6], it wakes up the thread > > calling 'synchronize_rcu' which seems good amount of code for the > > data plane. > >=20 > > [1] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/st > > atic/urcu-qsbr.h#L85 > > [2] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c > > #L68 > > [3] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c > > #L344 > > [4] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c > > #L58 > > [5] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/st > > atic/urcu-qsbr.h#L211 > > [6] https://github.com/urcu/userspace-rcu/blob/master/include/urcu/st > > atic/urcu-qsbr.h#L193 > >=20 > > Coming to what is provided in rte_tqs: > >=20 > > The synchronize_rcu functionality is split in to 2 APIs: > > rte_tqs_start and rte_tqs_check. The reader data is maintained as an > > array. > >=20 > > Both the APIs are lock-free, allowing them to be called from multiple > > threads concurrently. This allows multiple writers to wait for their > > grace periods concurrently as well as overlap their grace periods. > > rte_tqs_start API returns a token which provides the ability to > > separate the quiescent state waiting of different writers. Hence, no > > writer waits for other writer's grace period to get over.=C2=A0 > > Since these 2 APIs are lock-free, they can be called from writers > > running on worker cores as well without the need for a separate > > thread to do garbage collection. > >=20 > > The separation into 2 APIs provides the ability for writers to not > > spin cycles waiting for the grace period to get over. This enables > > different ways of doing garbage collection. For ex: a data structure > > delete API could remove the entry from the data structure, call > > rte_tqs_start and return back to the caller. On the invocation of > > next API call of the library, the API can call rte_tqs_check (which > > will mostly indicate that the grace period is complete) and free the > > previously deleted entry. > >=20 > > rte_tqs_update (mapping to rcu_quiescent_state) is pretty small and > > simple. > >=20 > > rte_tqs_register and rte_tqs_unregister APIs are lock free. Hence > > additional APIs like rcu_thread_online and rcu_thread_offline are not > > required. The rte_tqs_unregister API (when compared to > > rcu_thread_offline) is much simple and conducive to be used in worker > > threads. =20 >=20 > liburcu has many flavours already, qsbr being one of them. If none of > those are optimal for this use case, why not work with upstream to > either improve the existing flavours, or add a new one? >=20 > You have the specific knowledge and expertise about the requirements > needs for the implementation for this use case, and they have the long- > time and extensive experience maintaining the library on a wide range > of systems and use cases. Why not combine both? > I might be wrong, but to me, nothing described above seems to be > particularly or uniquely tied to implementing a software dataplane or > to DPDK. IMHO we should pool and share wherever possible, rather than > build an ecosystem closed onto itself. >=20 > Just my 2c of course! >=20 > > > DPDK needs to fix its dependency model, and just admit that it is > > > ok to build off > > > of more than glibc. > > >=20 > > > Having used liburcu, it can be done in a small manner and really > > > isn't that > > > confusing. > > >=20 > > > Is your real issue the LGPL license of liburcu for your skittish > > > customers? =20 > >=20 > > I have not had any discussions on this. Customers are mainly focused > > on having a solution on which they have meaningful control. They want > > to be able to submit a patch and change things if required. For ex: > > barriers for Arm [7] are not optimal. How easy is it to change this > > and get it into distros (there are both internal and external factors > > here)? =20 >=20 > It's just as easy (or as hard) as it is with DPDK. So it's either wait > for the distros to update, or rebuild locally. >=20 Either way it would be useful to have the broader RCU community in on the discussion.