From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>, dev@dpdk.org
Cc: "Erik Gabriel Carrillo" <erik.g.carrillo@intel.com>,
"David Marchand" <david.marchand@redhat.com>,
<maria.lingemark@ericsson.com>,
"Stefan Sundkvist" <stefan.sundkvist@ericsson.com>,
"Stephen Hemminger" <stephen@networkplumber.org>,
"Tyler Retzlaff" <roretzla@linux.microsoft.com>
Subject: RE: [RFC v2 2/2] eal: add high-performance timer facility
Date: Fri, 24 Mar 2023 17:00:07 +0100 [thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87809@smartserver.smartshare.dk> (raw)
In-Reply-To: <20230315170342.214127-3-mattias.ronnblom@ericsson.com>
> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> Sent: Wednesday, 15 March 2023 18.04
>
> The htimer library attempts at providing a timer facility with roughly
> the same functionality, but less overhead and better scalability than
> DPDK timer library.
>
> The htimer library employs per-lcore hierarchical timer wheels and a
> message-based synchronization/MT-safety scheme.
>
> RFC v2:
> * Fix spelling.
> * Fix signed/unsigned comparisons and discontinue the use of name-less
> function parameters, both of which may result in compiler warnings.
> * Undo the accidental removal of the bitset tests from the 'fast_tests'.
> * Add a number of missing include files, causing build failures
> (e.g., on AArch64 builds).
> * Add perf test attempting to compare rte_timer, rte_htimer and rte_htw.
> * Use nanoseconds (instead of TSC) as the default time unit.
> * add() and manage() has flags which allows the caller to specify the
> time unit (nanoseconds, TSC, or ticks) for the times provided.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---
Two more series of comments, see inline below:
1. Arguing for using "tick" as the default unit of time.
2. Some bugs in the time conversion functions.
[...]
> diff --git a/lib/htimer/meson.build b/lib/htimer/meson.build
> new file mode 100644
> index 0000000000..2dd5d6a24b
> --- /dev/null
> +++ b/lib/htimer/meson.build
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2023 Ericsson AB
> +
> +sources = files('rte_htw.c', 'rte_htimer_msg_ring.c', 'rte_htimer_mgr.c')
> +headers = files('rte_htimer_mgr.h', 'rte_htimer.h')
> +
> +deps += ['ring']
> diff --git a/lib/htimer/rte_htimer.h b/lib/htimer/rte_htimer.h
> new file mode 100644
> index 0000000000..6ac86292b5
> --- /dev/null
> +++ b/lib/htimer/rte_htimer.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 Ericsson AB
> + */
> +
> +#ifndef _RTE_HTIMER_H_
> +#define _RTE_HTIMER_H_
> +
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <sys/queue.h>
> +
> +#include <rte_bitops.h>
> +
> +struct rte_htimer;
> +
> +typedef void (*rte_htimer_cb_t)(struct rte_htimer *, void *);
> +
> +struct rte_htimer {
> + /**
> + * Absolute timer expiration time (in ticks).
> + */
> + uint64_t expiration_time;
> + /**
> + * Time between expirations (in ticks). Zero for one-shot timers.
> + */
> + uint64_t period;
> + /**
> + * Owning lcore. May safely be read from any thread.
> + */
> + uint32_t owner_lcore_id;
> + /**
> + * The current state of the timer.
> + */
> + uint32_t state:4;
> + /**
> + * Flags set on this timer.
> + */
> + uint32_t flags:28;
> + /**
> + * User-specified callback function pointer.
> + */
> + rte_htimer_cb_t cb;
> + /**
> + * Argument for user callback.
> + */
> + void *cb_arg;
> + /**
> + * Pointers used to add timer to various internal lists.
> + */
> + LIST_ENTRY(rte_htimer) entry;
> +};
> +
> +#define RTE_HTIMER_FLAG_ABSOLUTE_TIME RTE_BIT32(0)
> +#define RTE_HTIMER_FLAG_PERIODICAL RTE_BIT32(1)
> +#define RTE_HTIMER_FLAG_TIME_TICK RTE_BIT32(2)
> +#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(3)
After further consideration, and taking the time conversion functions into account, I think the default unit of time should be "tick", not nanoseconds. It seems more natural, and might offer more flexibility in the future.
So instead of:
+#define RTE_HTIMER_FLAG_TIME_TICK RTE_BIT32(2)
+#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(3)
then:
+#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(2)
+#define RTE_HTIMER_FLAG_TIME_NS RTE_BIT32(3)
and perhaps in the future:
+#define RTE_HTIMER_FLAG_TIME_US RTE_BIT32(4)
> +
> +#define RTE_HTIMER_STATE_PENDING 1
> +#define RTE_HTIMER_STATE_EXPIRED 2
> +#define RTE_HTIMER_STATE_CANCELED 3
> +
> +LIST_HEAD(rte_htimer_list, rte_htimer);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_HTIMER_H_ */
> diff --git a/lib/htimer/rte_htimer_mgr.c b/lib/htimer/rte_htimer_mgr.c
> new file mode 100644
> index 0000000000..efdfcf0985
> --- /dev/null
> +++ b/lib/htimer/rte_htimer_mgr.c
> @@ -0,0 +1,547 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +#include <math.h>
> +#include <stdbool.h>
> +#include <sys/queue.h>
> +#include <unistd.h>
> +
> +#include <rte_branch_prediction.h>
> +#include <rte_common.h>
> +#include <rte_cycles.h>
> +#include <rte_errno.h>
> +#include <rte_htw.h>
> +#include <rte_prefetch.h>
> +#include <rte_ring_elem.h>
> +
> +#include "rte_htimer_mgr.h"
> +#include "rte_htimer_msg.h"
> +#include "rte_htimer_msg_ring.h"
> +
> +#define MAX_MSG_BATCH_SIZE 16
> +
> +struct htimer_mgr {
> + struct rte_htimer_msg_ring *msg_ring;
> + struct rte_htw *htw;
> +
> + unsigned int async_msgs_idx __rte_cache_aligned;
> + unsigned int num_async_msgs;
> + struct rte_htimer_msg async_msgs[MAX_MSG_BATCH_SIZE];
> +} __rte_cache_aligned;
> +
> +static uint64_t ns_per_tick;
> +static double tsc_per_tick;
> +
> +static struct htimer_mgr mgrs[RTE_MAX_LCORE + 1];
> +
> +#define MAX_ASYNC_TRANSACTIONS 1024
> +#define MSG_RING_SIZE MAX_ASYNC_TRANSACTIONS
> +
> +static inline uint64_t
> +tsc_to_tick(uint64_t tsc)
> +{
> + return tsc / tsc_per_tick;
> +}
> +
> +static inline uint64_t
> +tsc_to_tick_round_up(uint64_t tsc)
> +{
> + uint64_t tick;
> +
> + tick = (tsc + tsc_per_tick / 2) / tsc_per_tick;
This does not round up, it rounds off.
E.g. tsc_per_tick=10.0, tsc=1 becomes (1 + 5.0) / 10.0 = 0.6, which becomes 0 (when converted to integer).
E.g. tsc_per_tick=10.0, tsc=5 becomes (5 + 5.0) / 10.0 = 1.0, which becomes 1.
> +
> + return tick;
> +}
> +
> +static inline uint64_t
> +ns_to_tick(uint64_t ns)
> +{
> + return ns / ns_per_tick;
> +}
> +
> +static inline uint64_t
> +ns_to_tick_round_up(uint64_t ns)
> +{
> + uint64_t tick;
> +
> + tick = ceil(ns / ns_per_tick);
ns_per_tick is integer, not floating point, so the division is performed as integer division, and ceil() has no effect; i.e. the above is the same as:
tick = ns / ns_per_tick;
Which also means that it does not round up.
> +
> + return tick;
> +}
> +
> +static inline uint64_t
> +tick_to_ns(uint64_t tick)
> +{
> + return tick * ns_per_tick;
> +}
> +
> +static struct htimer_mgr *
> +mgr_get(unsigned int lcore_id)
> +{
> + return &mgrs[lcore_id];
> +}
> +
> +static int
> +mgr_init(unsigned int lcore_id)
> +{
> + char ring_name[RTE_RING_NAMESIZE];
> + unsigned int socket_id;
> + struct htimer_mgr *mgr = &mgrs[lcore_id];
> +
> + socket_id = rte_lcore_to_socket_id(lcore_id);
> +
> + snprintf(ring_name, sizeof(ring_name), "htimer_%d", lcore_id);
> +
> + mgr->msg_ring =
> + rte_htimer_msg_ring_create(ring_name, MSG_RING_SIZE, socket_id,
> + RING_F_SC_DEQ);
> +
> + if (mgr->msg_ring == NULL)
> + goto err;
> +
> + mgr->htw = rte_htw_create();
> +
> + if (mgr->htw == NULL)
> + goto err_free_ring;
> +
> + mgr->async_msgs_idx = 0;
> + mgr->num_async_msgs = 0;
> +
> + return 0;
> +
> +err_free_ring:
> + rte_htimer_msg_ring_free(mgr->msg_ring);
> +err:
> + return -ENOMEM;
> +}
> +
> +static void
> +mgr_deinit(unsigned int lcore_id)
> +{
> + struct htimer_mgr *mgr = &mgrs[lcore_id];
> +
> + rte_htw_destroy(mgr->htw);
> +
> + rte_htimer_msg_ring_free(mgr->msg_ring);
> +}
> +
> +static volatile bool initialized;
> +
> +static void
> +assure_initialized(void)
> +{
> + RTE_ASSERT(initialized);
> +}
> +
> +int
> +rte_htimer_mgr_init(uint64_t _ns_per_tick)
> +{
> + unsigned int lcore_id;
> +
> + RTE_VERIFY(!initialized);
> +
> + ns_per_tick = _ns_per_tick;
> +
> + tsc_per_tick = (ns_per_tick / 1e9) * rte_get_tsc_hz();
> +
> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> + int rc;
> +
> + rc = mgr_init(lcore_id);
> +
> + if (rc < 0) {
> + unsigned int deinit_lcore_id;
> +
> + for (deinit_lcore_id = 0; deinit_lcore_id < lcore_id;
> + deinit_lcore_id++)
> + mgr_deinit(deinit_lcore_id);
> +
> + return rc;
> + }
> + }
> +
> + initialized = true;
> +
> + return 0;
> +}
> +
> +void
> +rte_htimer_mgr_deinit(void)
> +{
> + unsigned int lcore_id;
> +
> + assure_initialized();
> +
> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> + mgr_deinit(lcore_id);
> +
> + initialized = false;
> +}
> +
> +static void
> +assure_valid_time_conversion_flags(uint32_t flags __rte_unused)
> +{
> + RTE_ASSERT(!((flags & RTE_HTIMER_FLAG_TIME_TSC) &&
> + (flags & RTE_HTIMER_FLAG_TIME_TICK)));
With my above suggestion of using tick as default time unit, this would be changed to:
+ RTE_ASSERT(!((flags & RTE_HTIMER_FLAG_TIME_TSC) &&
+ (flags & RTE_HTIMER_FLAG_TIME_NS) &&
+ (flags & RTE_HTIMER_FLAG_TIME_US)));
> +}
> +
> +static void
> +assure_valid_add_flags(uint32_t flags)
> +{
> + assure_valid_time_conversion_flags(flags);
> +
> + RTE_ASSERT(!(flags & ~(RTE_HTIMER_FLAG_PERIODICAL |
> + RTE_HTIMER_FLAG_ABSOLUTE_TIME |
> + RTE_HTIMER_FLAG_TIME_TSC |
> + RTE_HTIMER_FLAG_TIME_TICK)));
With my above suggestion of using tick as default time unit, this would be changed to:
+ RTE_ASSERT(!(flags & ~(RTE_HTIMER_FLAG_PERIODICAL |
+ RTE_HTIMER_FLAG_ABSOLUTE_TIME |
+ RTE_HTIMER_FLAG_TIME_TSC |
+ RTE_HTIMER_FLAG_TIME_NS |
+ RTE_HTIMER_FLAG_TIME_US)));
> +}
> +
> +static uint64_t
> +convert_time(uint64_t t, uint32_t flags)
> +{
> + if (flags & RTE_HTIMER_FLAG_TIME_TSC)
> + return tsc_to_tick(t);
> + else if (flags & RTE_HTIMER_FLAG_TIME_TICK)
> + return t;
> + else
> + return ns_to_tick(t);
With my above suggestion of using tick as default time unit, this would be changed to:
+ if (flags & RTE_HTIMER_FLAG_TIME_TSC)
+ return tsc_to_tick(t);
+ else if (flags & RTE_HTIMER_FLAG_TIME_NS)
+ return ns_to_tick(t);
+ else if (flags & RTE_HTIMER_FLAG_TIME_US)
+ return us_to_tick(t);
+ else
+ return t;
> +}
> +
> +void
> +rte_htimer_mgr_add(struct rte_htimer *timer, uint64_t expiration_time,
> + uint64_t period, rte_htimer_cb_t timer_cb,
> + void *timer_cb_arg, uint32_t flags)
> +{
> + unsigned int lcore_id = rte_lcore_id();
> + struct htimer_mgr *mgr = mgr_get(lcore_id);
> + uint64_t expiration_time_tick;
> + uint64_t period_tick;
> +
> + assure_initialized();
> +
> + assure_valid_add_flags(flags);
> +
> + expiration_time_tick = convert_time(expiration_time, flags);
> +
> + period_tick = convert_time(period, flags);
> +
> + rte_htw_add(mgr->htw, timer, expiration_time_tick, period_tick,
> + timer_cb, timer_cb_arg, flags);
> +
> + timer->owner_lcore_id = lcore_id;
> +}
next prev parent reply other threads:[~2023-03-24 16:00 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-28 9:39 [RFC 0/2] Add " Mattias Rönnblom
2023-02-28 9:39 ` [RFC 1/2] eal: add bitset type Mattias Rönnblom
2023-02-28 18:46 ` Tyler Retzlaff
2023-03-02 6:31 ` Mattias Rönnblom
2023-03-02 20:39 ` Tyler Retzlaff
2023-02-28 9:39 ` [RFC 2/2] eal: add high-performance timer facility Mattias Rönnblom
2023-03-05 17:25 ` Stephen Hemminger
2023-03-09 15:20 ` Mattias Rönnblom
2023-02-28 16:01 ` [RFC 0/2] Add " Morten Brørup
2023-03-01 11:18 ` Mattias Rönnblom
2023-03-01 13:31 ` Morten Brørup
2023-03-01 15:50 ` Mattias Rönnblom
2023-03-01 17:06 ` Morten Brørup
2023-03-15 17:03 ` [RFC v2 " Mattias Rönnblom
2023-03-15 17:03 ` [RFC v2 1/2] eal: add bitset type Mattias Rönnblom
2023-03-15 17:20 ` Stephen Hemminger
2023-03-15 18:27 ` Mattias Rönnblom
2023-03-15 17:03 ` [RFC v2 2/2] eal: add high-performance timer facility Mattias Rönnblom
2023-03-16 3:55 ` Tyler Retzlaff
2023-03-17 1:58 ` Stephen Hemminger
2023-03-22 12:18 ` Morten Brørup
2023-04-03 12:04 ` Mattias Rönnblom
2023-04-04 7:32 ` Morten Brørup
2023-03-24 16:00 ` Morten Brørup [this message]
2023-07-06 22:41 ` Stephen Hemminger
2023-07-12 8:58 ` Mattias Rönnblom
2024-10-03 18:36 ` [RFC v2 0/2] Add " Stephen Hemminger
2024-10-03 21:32 ` Morten Brørup
2024-10-06 13:02 ` Mattias Rönnblom
2024-10-06 13:43 ` Morten Brørup
2024-10-06 14:43 ` Mattias Rönnblom
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D87809@smartserver.smartshare.dk \
--to=mb@smartsharesystems.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=erik.g.carrillo@intel.com \
--cc=maria.lingemark@ericsson.com \
--cc=mattias.ronnblom@ericsson.com \
--cc=roretzla@linux.microsoft.com \
--cc=stefan.sundkvist@ericsson.com \
--cc=stephen@networkplumber.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).