DPDK patches and discussions
From: "Mattias Rönnblom" <hofors@lysator.liu.se>
To: "Morten Brørup" <mb@smartsharesystems.com>,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	"Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: dev@dpdk.org, Erik Gabriel Carrillo <erik.g.carrillo@intel.com>,
	David Marchand <david.marchand@redhat.com>,
	maria.lingemark@ericsson.com,
	Stefan Sundkvist <stefan.sundkvist@ericsson.com>,
	Tyler Retzlaff <roretzla@linux.microsoft.com>
Subject: Re: [RFC v2 0/2] Add high-performance timer facility
Date: Sun, 6 Oct 2024 16:43:37 +0200
Message-ID: <4314ffce-38f1-4b0a-8673-55d201e20002@lysator.liu.se>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F780@smartserver.smartshare.dk>

On 2024-10-06 15:43, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>> Sent: Sunday, 6 October 2024 15.03
>>
>> On 2024-10-03 23:32, Morten Brørup wrote:
>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>>> Sent: Thursday, 3 October 2024 20.37
>>>>
>>>> On Wed, 15 Mar 2023 18:03:40 +0100
>>>> Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:
>>>>
>>>>> This patchset is an attempt to introduce a high-performance, highly
>>>>> scalable timer facility into DPDK.
>>>>>
>>>>> More specifically, the goals for the htimer library are:
>>>>>
>>>>> * Efficient handling of a handful up to hundreds of thousands of
>>>>>     concurrent timers.
>>>>> * Make adding and canceling timers low-overhead, constant-time
>>>>>     operations.
>>>>> * Provide a service functionally equivalent to that of
>>>>>     <rte_timer.h>. API/ABI backward compatibility is secondary.
>>>>
>>>> Worthwhile goals, and the problem needs to be addressed.
>>>> But this patch never got accepted.
>>>
>>> I think work on it was put on hold due to the requested changes
>>> requiring a significant development effort.
>>> I too look forward to work on this being resumed. ;-)
>>>
>>>>
>>>> Please fix/improve/extend existing rte_timer instead.
>>>
>>> The rte_timer API is too "fat" for use in the fast path with millions
>>> of timers, e.g. TCP flow timers.
>>>
>>> Shoehorning a fast path feature into a slow path API is not going to
>>> cut it. I support having a separate htimer library with its own API for
>>> high volume, high-performance fast path timers.
>>>
>>> When striving for low latency across the internet, timing is
>>> everything. Packet pacing is the "new" hot thing in congestion control
>>> algorithms, and a simple software implementation would require a timer
>>> firing once per packet.
>>>
>>
>> I think DPDK should have two public APIs in the timer area.
> 
> Agree.
> 
>> One is a
>> bare-bones hierarchical timer wheel API, without callbacks,
>> auto-created per-lcore instances, MT safety or any other of the
>> <rte_timer.h> bells and whistles. It also doesn't make any assumptions
>> about the time source (other than it being monotonic) or resolution.
> 
> The <rte_timer.h> library does not - and is never going to - provide sufficient performance for timer intensive applications, such as packet pacing and fast path TCP/QUIC/whatever congestion control. It is too "fat" for this.
> 
> We need a new library with a new API for that.
> I agree with Mattias' description of the requirements for such a library.
> 
>>
>> The other is a new variant of <rte_timer.h>, using the core HTW library
>> for its implementation (and being public, it may also expose this
>> library in its header files, which may be required for efficient
>> operation). The new <rte_timer.h> would provide the same kind of
>> functionality as the old API, but with some quirks and bugs fixed, plus
>> potentially some new functionality added. For example, it would be
>> useful to allow non-preemption-safe threads to add and remove timers
>> (something rte_timer and its spinlocks don't allow).
> 
> Agree.
> 
> Until that becomes part of DPDK, we will have to stick with what <rte_timer.h> currently offers.
> 
>>
>> I would consider both "fast path APIs".
>>
>> In addition, there should probably also be a time source API.
> 
> A third library, orthogonal to the two other timer libraries.
> But I see why you mention it: It could be somewhat related to the design and implementation of the <rte_timer.h> library.
> But, let's please forget about a time source API for now.
> 
>>
>> Considering the lead time of relatively small contributions like the
>> bitops extensions and the new bitset API (which still aren't in), I
>> can't imagine how long it would take to get a semi-backward
>> compatible rte_timer with a new implementation, plus a new timer wheel
>> library, into DPDK.
> 
> Well said!
> 
> Instead of aiming for an unreachable target, let's take this approach:
> - Provide the new high-performance HTW library as a stand-alone library.
> - Postpone improving the <rte_timer.h> library; it can be done any time in the future, if someone cares to do it. And it can use the HTW library or not, whichever is appropriate.
> 
> Doing both simultaneously would require a substantial effort, and would cause much backpressure from the community (due to the modified <rte_timer.h> API and implementation).
> 
> Although it might be beneficial for the design of the HTW library to consider how an improved <rte_timer.h> would use it, it is not the primary use case of the HTW library, so co-design is not a requirement here.
> 

Postponing rte_timer improvements would also mean postponing most of the 
benefits of the new timer wheel, in my opinion.

In most scenarios, I think you want to have all application modules 
sharing timer wheel instances, preferably without having to agree on a 
proprietary timer API. Here rte_timer shines.

Also, you want to get the HTW library *exactly* right for the rte_timer
use case. Once it is a public API, changing it to address any
shortcomings you accidentally designed in would be painful. To be on the
safe side, you would need to have a new rte_timer implementation ready
upon submitting an HTW library.

That in turn would require a techboard ACK on the necessity of rte_timer
API tweaks; otherwise all your work may be wasted.
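
To make the "bare-bones" wheel idea from earlier in the thread a bit more
concrete, here is a rough sketch of what such an API could look like. It is
not the htimer/HTW code from the RFC: every name in it (tw_wheel, tw_timer,
tw_add, tw_cancel, tw_poll, WHEEL_SLOTS) is hypothetical, and it is a
single-level wheel rather than a hierarchical one, so a timer can be at most
WHEEL_SLOTS ticks in the future. It does show the properties discussed: no
callbacks, no per-lcore instances, no MT safety, O(1) add and cancel, and a
caller-supplied monotonic tick count as the only time source.

```c
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 256u               /* assumption: must be a power of two */
#define WHEEL_MASK  (WHEEL_SLOTS - 1u)

/* Hypothetical timer handle, embedded in the caller's own object. */
struct tw_timer {
	struct tw_timer *next;
	struct tw_timer **pprev; /* pointer that points at us; NULL if idle */
	uint64_t expiry;         /* absolute tick */
};

/* Hypothetical single-level wheel; the caller owns and serializes access. */
struct tw_wheel {
	struct tw_timer *slots[WHEEL_SLOTS];
	uint64_t now;            /* last tick fully processed */
};

static void tw_init(struct tw_wheel *w, uint64_t now)
{
	for (size_t i = 0; i < WHEEL_SLOTS; i++)
		w->slots[i] = NULL;
	w->now = now;
}

/* O(1). Fails unless now < expiry <= now + WHEEL_SLOTS. */
static int tw_add(struct tw_wheel *w, struct tw_timer *t, uint64_t expiry)
{
	if (expiry <= w->now || expiry - w->now > WHEEL_SLOTS)
		return -1;

	struct tw_timer **head = &w->slots[expiry & WHEEL_MASK];

	t->expiry = expiry;
	t->next = *head;
	t->pprev = head;
	if (*head != NULL)
		(*head)->pprev = &t->next;
	*head = t;
	return 0;
}

/* O(1). Only valid on a timer that has been passed to tw_add(). */
static void tw_cancel(struct tw_timer *t)
{
	if (t->pprev == NULL)
		return; /* already expired or canceled */
	*t->pprev = t->next;
	if (t->next != NULL)
		t->next->pprev = t->pprev;
	t->next = NULL;
	t->pprev = NULL;
}

/*
 * Return one timer with expiry <= now, or NULL when none is left.
 * No callback is invoked; the caller decides what an expiry means.
 */
static struct tw_timer *tw_poll(struct tw_wheel *w, uint64_t now)
{
	while (w->now < now) {
		struct tw_timer *t = w->slots[(w->now + 1) & WHEEL_MASK];

		if (t != NULL) {
			tw_cancel(t);
			return t;
		}
		w->now++; /* slot drained; move on to the next tick */
	}
	return NULL;
}
```

A caller would loop on tw_poll() with its current tick until it returns
NULL. An rte_timer-style layer (callbacks, per-lcore wheels, locking) could
sit on top of something like this, which is where the API details of the
lower layer start to matter.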

