Message-ID: <4314ffce-38f1-4b0a-8673-55d201e20002@lysator.liu.se>
Date: Sun, 6 Oct 2024 16:43:37 +0200
Subject: Re: [RFC v2 0/2] Add high-performance timer facility
From: Mattias Rönnblom
To: Morten Brørup, Stephen Hemminger, Mattias Rönnblom
Cc: dev@dpdk.org, Erik Gabriel Carrillo, David Marchand, maria.lingemark@ericsson.com, Stefan Sundkvist, Tyler Retzlaff
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F780@smartserver.smartshare.dk>

On 2024-10-06 15:43, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>> Sent: Sunday, 6 October 2024 15.03
>>
>> On 2024-10-03 23:32, Morten Brørup wrote:
>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>>> Sent: Thursday, 3 October 2024 20.37
>>>>
>>>> On Wed, 15 Mar 2023 18:03:40 +0100
>>>> Mattias Rönnblom wrote:
>>>>
>>>>> This patchset is an attempt to introduce a high-performance, highly
>>>>> scalable timer facility into DPDK.
>>>>>
>>>>> More specifically, the goals for the htimer library are:
>>>>>
>>>>> * Efficient handling of a handful up to hundreds of thousands of
>>>>>   concurrent timers.
>>>>> * Make adding and canceling timers low-overhead, constant-time
>>>>>   operations.
>>>>> * Provide a service functionally equivalent to that of
>>>>>   rte_timer. API/ABI backward compatibility is secondary.
>>>>
>>>> Worthwhile goals, and the problem needs to be addressed.
>>>> But this patch never got accepted.
>>>
>>> I think work on it was put on hold due to the requested changes requiring a significant development effort.
>>> I too look forward to work on this being resumed. ;-)
>>>
>>>>
>>>> Please fix/improve/extend existing rte_timer instead.
>>>
>>> The rte_timer API is too "fat" for use in the fast path with millions of timers, e.g. TCP flow timers.
>>>
>>> Shoehorning a fast path feature into a slow path API is not going to cut it. I support having a separate htimer library with its own API for high volume, high-performance fast path timers.
>>>
>>> When striving for low latency across the internet, timing is everything. Packet pacing is the "new" hot thing in congestion control algorithms, and a simple software implementation would require a timer firing once per packet.
>>>
>>
>> I think DPDK should have two public APIs in the timer area.
>
> Agree.
>
>> One is just a bare-bones hierarchical timer wheel API, without callbacks,
>> auto-created per-lcore instances, MT safety or any other of the
>> bells and whistles. It also doesn't make any assumptions
>> about the time source (other than it being monotonic) or resolution.
>
> The rte_timer library does not - and is never going to - provide sufficient performance for timer intensive applications, such as packet pacing and fast path TCP/QUIC/whatever congestion control. It is too "fat" for this.
>
> We need a new library with a new API for that.
> I agree with Mattias' description of the requirements for such a library.
>
>>
>> The other is a new variant of rte_timer, using the core HTW library
>> for its implementation (and being public, it may also expose this
>> library in its header files, which may be required for efficient
>> operation). The new rte_timer would provide the same kind of
>> functionality as the old API, but with some quirks and bugs fixed, plus
>> potentially some new functionality added. For example, it would be
>> useful to allow non-preemption safe threads to add and remove timers
>> (something rte_timer and its spinlocks don't allow).
>
> Agree.
>
> Until that becomes part of DPDK, we will have to stick with what rte_timer currently offers.
>
>>
>> I would consider both "fast path APIs".
>>
>> In addition, there should probably also be a time source API.
>
> A third library, orthogonal to the two other timer libraries.
> But I see why you mention it: It could be somewhat related to the design and implementation of the library.
> But, let's please forget about a time source API for now.
>
>>
>> Considering the lead time of relatively small contributions like the
>> bitops extensions and the new bitset API (which still aren't in), I
>> can't imagine how long it would take to get a semi-backward
>> compatible rte_timer with a new implementation, plus a new timer wheel
>> library, into DPDK.
>
> Well said!
>
> Instead of aiming for an unreachable target, let's take this approach:
> - Provide the new high-performance HTW library as a stand-alone library.
> - Postpone improving the rte_timer library; it can be done any time in the future, if someone cares to do it. And it can use the HTW library or not, whichever is appropriate.
>
> Doing both simultaneously would require a substantial effort, and would cause much backpressure from the community (due to the modified API and implementation).
>
> Although it might be beneficial for the design of the HTW library to consider how an improved rte_timer would use it, it is not the primary use case of the HTW library, so co-design is not a requirement here.
>

Postponing rte_timer improvements would also mean postponing most of the
benefits of the new timer wheel, in my opinion. In most scenarios, I
think you want to have all application modules sharing timer wheel
instances, preferably without having to agree on a proprietary timer
API. Here rte_timer shines.

Also, you want to get the HTW library *exactly* right for the rte_timer
use case. Making it a public API would make it painful to change that
API later to address any shortcomings you accidentally designed in.

To be on the safe side, you would need to have a new rte_timer
implementation ready upon submitting an HTW library. That in turn would
require a techboard ACK on the necessity of rte_timer API tweaks,
otherwise all your work may be wasted.
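
For readers following the thread, the "bare-bones" timer wheel discussed above (no callbacks, no MT safety, time supplied by the caller and only assumed to be monotonic) might look roughly like the sketch below. It is deliberately simplified: a single-level hashed wheel rather than the hierarchical design of the htimer patchset. Every name in it (tw_wheel, tw_timer, tw_add, tw_cancel, tw_advance, WHEEL_SLOTS) is hypothetical and is not part of DPDK or of the RFC.

```c
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 256 /* hypothetical size; must be a power of two */

/* Intrusive timer node. The doubly linked list makes cancel O(1). */
struct tw_timer {
    struct tw_timer *prev, *next;
    uint64_t expiry; /* absolute expiry time, in caller-defined ticks */
};

/* One wheel instance. No locking: the caller owns it exclusively. */
struct tw_wheel {
    struct tw_timer slots[WHEEL_SLOTS]; /* sentinel list heads */
    uint64_t now;                       /* last tick processed */
};

static void tw_list_init(struct tw_timer *head)
{
    head->prev = head->next = head;
}

static void tw_list_add_tail(struct tw_timer *head, struct tw_timer *t)
{
    t->prev = head->prev;
    t->next = head;
    head->prev->next = t;
    head->prev = t;
}

static void tw_list_del(struct tw_timer *t)
{
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->prev = t->next = t; /* self-linked, so a second delete is harmless */
}

void tw_init(struct tw_wheel *w, uint64_t now)
{
    for (size_t i = 0; i < WHEEL_SLOTS; i++)
        tw_list_init(&w->slots[i]);
    w->now = now;
}

/* Constant-time add. Expiry times in the past fire on the next tick. */
void tw_add(struct tw_wheel *w, struct tw_timer *t, uint64_t expiry)
{
    uint64_t slot_time = expiry > w->now ? expiry : w->now + 1;

    t->expiry = expiry;
    tw_list_add_tail(&w->slots[slot_time & (WHEEL_SLOTS - 1)], t);
}

/* Constant-time cancel; the timer must have been added at least once. */
void tw_cancel(struct tw_timer *t)
{
    tw_list_del(t);
}

/*
 * Advance the wheel to new_now. Instead of invoking callbacks, expired
 * timers are appended to the caller-provided list head 'expired'.
 * Returns the number of timers that expired.
 */
size_t tw_advance(struct tw_wheel *w, uint64_t new_now,
                  struct tw_timer *expired)
{
    size_t count = 0;

    while (w->now < new_now) {
        w->now++;
        struct tw_timer *head = &w->slots[w->now & (WHEEL_SLOTS - 1)];
        struct tw_timer *t = head->next;

        while (t != head) {
            struct tw_timer *next = t->next;

            /* Timers due in a later revolution stay in the slot. */
            if (t->expiry <= w->now) {
                tw_list_del(t);
                tw_list_add_tail(expired, t);
                count++;
            }
            t = next;
        }
    }
    return count;
}
```

A caller would initialize its own 'expired' list head with tw_list_init(), call tw_advance() from its poll loop with whatever monotonic tick count it uses, and then walk the list. In this sketch tw_advance() costs one slot visit per elapsed tick, and timers more than WHEEL_SLOTS ticks away are re-examined once per revolution; a hierarchical wheel exists largely to avoid that rescanning.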