From: "Mattias Rönnblom" <hofors@lysator.liu.se>
To: "Morten Brørup" <mb@smartsharesystems.com>,
"Stephen Hemminger" <stephen@networkplumber.org>,
"Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: dev@dpdk.org, Erik Gabriel Carrillo <erik.g.carrillo@intel.com>,
David Marchand <david.marchand@redhat.com>,
maria.lingemark@ericsson.com,
Stefan Sundkvist <stefan.sundkvist@ericsson.com>,
Tyler Retzlaff <roretzla@linux.microsoft.com>
Subject: Re: [RFC v2 0/2] Add high-performance timer facility
Date: Sun, 6 Oct 2024 16:43:37 +0200
Message-ID: <4314ffce-38f1-4b0a-8673-55d201e20002@lysator.liu.se>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F780@smartserver.smartshare.dk>
On 2024-10-06 15:43, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>> Sent: Sunday, 6 October 2024 15.03
>>
>> On 2024-10-03 23:32, Morten Brørup wrote:
>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>>> Sent: Thursday, 3 October 2024 20.37
>>>>
>>>> On Wed, 15 Mar 2023 18:03:40 +0100
>>>> Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:
>>>>
>>>>> This patchset is an attempt to introduce a high-performance, highly
>>>>> scalable timer facility into DPDK.
>>>>>
>>>>> More specifically, the goals for the htimer library are:
>>>>>
>>>>> * Efficient handling of a handful up to hundreds of thousands of
>>>>> concurrent timers.
>>>>> * Make adding and canceling timers low-overhead, constant-time
>>>>> operations.
>>>>> * Provide a service functionally equivalent to that of
>>>>> <rte_timer.h>. API/ABI backward compatibility is secondary.
>>>>
>>>> Worthwhile goals, and the problem needs to be addressed.
>>>> But this patch never got accepted.
>>>
>>> I think work on it was put on hold due to the requested changes
>>> requiring a significant development effort.
>>> I too look forward to work on this being resumed. ;-)
>>>
>>>>
>>>> Please fix/improve/extend existing rte_timer instead.
>>>
>>> The rte_timer API is too "fat" for use in the fast path with millions
>>> of timers, e.g. TCP flow timers.
>>>
>>> Shoehorning a fast path feature into a slow path API is not going to
>>> cut it. I support having a separate htimer library with its own API for
>>> high volume, high-performance fast path timers.
>>>
>>> When striving for low latency across the internet, timing is
>>> everything. Packet pacing is the "new" hot thing in congestion control
>>> algorithms, and a simple software implementation would require a timer
>>> firing once per packet.
>>>
>>
>> I think DPDK should have two public APIs in the timer area.
>
> Agree.
>
>> One is
>> just a bare-bones hierarchical timer wheel API, without callbacks,
>> auto-created per-lcore instances, MT safety or any other of the
>> <rte_timer.h> bells and whistles. It also doesn't make any assumptions
>> about the time source (other than it being monotonic) or resolution.
>
> The <rte_timer.h> library does not - and is never going to - provide sufficient performance for timer intensive applications, such as packet pacing and fast path TCP/QUIC/whatever congestion control. It is too "fat" for this.
>
> We need a new library with a new API for that.
> I agree with Mattias' description of the requirements for such a library.
>
>>
>> The other is a new variant of <rte_timer.h>, using the core HTW library
>> for its implementation (and being public, it may also expose this
>> library in its header files, which may be required for efficient
>> operation). The new <rte_timer.h> would provide the same kind of
>> functionality as the old API, but with some quirks and bugs fixed, plus
>> potentially some new functionality added. For example, it would be
>> useful to allow non-preemption safe threads to add and remove timers
>> (something rte_timer and its spinlocks don't allow).
>
> Agree.
>
> Until that becomes part of DPDK, we will have to stick with what <rte_timer.h> currently offers.
>
>>
>> I would consider both "fast path APIs".
>>
>> In addition, there should probably also be a time source API.
>
> A third library, orthogonal to the two other timer libraries.
> But I see why you mention it: It could be somewhat related to the design and implementation of the <rte_timer.h> library.
> But, let's please forget about a time source API for now.
>
>>
>> Considering the lead time of relatively small contributions like the
>> bitops extensions and the new bitset API (which still aren't in), I
>> can't imagine how long it would take to get in a semi-backward
>> compatible rte_timer with a new implementation, plus a new timer wheel
>> library, into DPDK.
>
> Well said!
>
> Instead of aiming for an unreachable target, let's instead take this approach:
> - Provide the new high-performance HTW library as a stand-alone library.
> - Postpone improving the <rte_timer.h> library; it can be done any time in the future, if someone cares to do it. And it can use the HTW library or not, whichever is appropriate.
>
> Doing both simultaneously would require a substantial effort, and would cause much backpressure from the community (due to the modified <rte_timer.h> API and implementation).
>
> Although it might be beneficial for the design of the HTW library to consider how an improved <rte_timer.h> would use it, it is not the primary use case of the HTW library, so co-design is not a requirement here.
>
Postponing rte_timer improvements would also mean postponing most of the
benefits of the new timer wheel, in my opinion.

In most scenarios, I think you want to have all application modules
sharing timer wheel instances, preferably without having to agree on a
proprietary timer API. Here rte_timer shines.

Also, you want to get the HTW library *exactly* right for the rte_timer
use case. Making it a public API would make it painful to change that API
later to address any shortcomings you accidentally designed in. To be on
the safe side, you would need to have a new rte_timer implementation
ready upon submitting an HTW library.

That in turn would require a techboard ACK on the necessity of rte_timer
API tweaks, otherwise all your work may be wasted.
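
For illustration only, here is a minimal sketch of what a callback-free
timer wheel core of the kind described earlier in this thread could look
like. It is not taken from the htimer patchset: every name (tw_init,
tw_add, tw_cancel, tw_advance, TW_SLOTS) is hypothetical, and it is a
single-level wheel rather than a hierarchical one, so it only accepts
timeouts up to TW_SLOTS ticks ahead. The caller supplies the current
time in ticks of its own choosing; the wheel only assumes that time is
monotonic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TW_SLOTS 256 /* assumed wheel size; must be a power of two */

/* Hypothetical types and names; this is not the htimer API. */
struct tw_timer {
	struct tw_timer *next;
	struct tw_timer **pprev; /* address of the pointer pointing at us */
	uint64_t expiry;         /* absolute tick, in caller-defined units */
};

struct tw {
	uint64_t now; /* last tick processed */
	struct tw_timer *slots[TW_SLOTS];
};

static void tw_init(struct tw *w, uint64_t now)
{
	w->now = now;
	for (size_t i = 0; i < TW_SLOTS; i++)
		w->slots[i] = NULL;
}

/* Constant-time add. Precondition: now < expiry <= now + TW_SLOTS. */
static void tw_add(struct tw *w, struct tw_timer *t, uint64_t expiry)
{
	assert(expiry > w->now && expiry - w->now <= TW_SLOTS);

	struct tw_timer **head = &w->slots[expiry & (TW_SLOTS - 1)];

	t->expiry = expiry;
	t->next = *head;
	t->pprev = head;
	if (t->next != NULL)
		t->next->pprev = &t->next;
	*head = t;
}

/* Constant-time cancel. Only valid for a timer that is still pending. */
static void tw_cancel(struct tw_timer *t)
{
	*t->pprev = t->next;
	if (t->next != NULL)
		t->next->pprev = t->pprev;
	t->next = NULL;
	t->pprev = NULL;
}

/*
 * Advance the wheel to 'now' and return the expired timers as a singly
 * linked list (via 'next'), most recently expired first. No callbacks
 * are invoked; the caller decides what to do with each timer.
 */
static struct tw_timer *tw_advance(struct tw *w, uint64_t now)
{
	struct tw_timer *expired = NULL;

	while (w->now < now) {
		w->now++;
		struct tw_timer **head = &w->slots[w->now & (TW_SLOTS - 1)];

		while (*head != NULL) {
			struct tw_timer *t = *head;

			tw_cancel(t); /* unlink from the slot */
			t->next = expired;
			expired = t;
		}
	}
	return expired;
}
```

There is no locking and no per-lcore state in this sketch; an
rte_timer-style layer would have to add callbacks, instance management
and thread safety on top.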