DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>, dev@dpdk.org
Cc: "Erik Gabriel Carrillo" <erik.g.carrillo@intel.com>,
	"David Marchand" <david.marchand@redhat.com>,
	<maria.lingemark@ericsson.com>,
	"Stefan Sundkvist" <stefan.sundkvist@ericsson.com>,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	"Tyler Retzlaff" <roretzla@linux.microsoft.com>
Subject: RE: [RFC v2 2/2] eal: add high-performance timer facility
Date: Fri, 24 Mar 2023 17:00:07 +0100	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87809@smartserver.smartshare.dk> (raw)
In-Reply-To: <20230315170342.214127-3-mattias.ronnblom@ericsson.com>

> From: Mattias Rönnblom [mailto:mattias.ronnblom@ericsson.com]
> Sent: Wednesday, 15 March 2023 18.04
> 
> The htimer library attempts at providing a timer facility with roughly
> the same functionality, but less overhead and better scalability than
> DPDK timer library.
> 
> The htimer library employs per-lcore hierarchical timer wheels and a
> message-based synchronization/MT-safety scheme.
> 
> RFC v2:
>  * Fix spelling.
>  * Fix signed/unsigned comparisons and discontinue the use of name-less
>    function parameters, both of which may result in compiler warnings.
>  * Undo the accidental removal of the bitset tests from the 'fast_tests'.
>  * Add a number of missing include files, causing build failures
>    (e.g., on AArch64 builds).
>  * Add perf test attempting to compare rte_timer, rte_htimer and rte_htw.
>  * Use nanoseconds (instead of TSC) as the default time unit.
>  * add() and manage() has flags which allows the caller to specify the
>    time unit (nanoseconds, TSC, or ticks) for the times provided.
> 
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> ---

Two more series of comments, see inline below:
1. Arguing for using "tick" as the default unit of time.
2. Some bugs in the time conversion functions.

[...]

> diff --git a/lib/htimer/meson.build b/lib/htimer/meson.build
> new file mode 100644
> index 0000000000..2dd5d6a24b
> --- /dev/null
> +++ b/lib/htimer/meson.build
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2023 Ericsson AB
> +
> +sources = files('rte_htw.c', 'rte_htimer_msg_ring.c', 'rte_htimer_mgr.c')
> +headers = files('rte_htimer_mgr.h', 'rte_htimer.h')
> +
> +deps += ['ring']
> diff --git a/lib/htimer/rte_htimer.h b/lib/htimer/rte_htimer.h
> new file mode 100644
> index 0000000000..6ac86292b5
> --- /dev/null
> +++ b/lib/htimer/rte_htimer.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 Ericsson AB
> + */
> +
> +#ifndef _RTE_HTIMER_H_
> +#define _RTE_HTIMER_H_
> +
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <sys/queue.h>
> +
> +#include <rte_bitops.h>
> +
> +struct rte_htimer;
> +
> +typedef void (*rte_htimer_cb_t)(struct rte_htimer *, void *);
> +
> +struct rte_htimer {
> +	/**
> +	 * Absolute timer expiration time (in ticks).
> +	 */
> +	uint64_t expiration_time;
> +	/**
> +	 * Time between expirations (in ticks). Zero for one-shot timers.
> +	 */
> +	uint64_t period;
> +	/**
> +	 * Owning lcore. May safely be read from any thread.
> +	 */
> +	uint32_t owner_lcore_id;
> +	/**
> +	 * The current state of the timer.
> +	 */
> +	uint32_t state:4;
> +	/**
> +	 * Flags set on this timer.
> +	 */
> +	uint32_t flags:28;
> +	/**
> +	 * User-specified callback function pointer.
> +	 */
> +	rte_htimer_cb_t cb;
> +	/**
> +	 * Argument for user callback.
> +	 */
> +	void *cb_arg;
> +	/**
> +	 * Pointers used to add timer to various internal lists.
> +	 */
> +	LIST_ENTRY(rte_htimer) entry;
> +};
> +
> +#define RTE_HTIMER_FLAG_ABSOLUTE_TIME RTE_BIT32(0)
> +#define RTE_HTIMER_FLAG_PERIODICAL RTE_BIT32(1)
> +#define RTE_HTIMER_FLAG_TIME_TICK RTE_BIT32(2)
> +#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(3)

After further consideration, and taking the time conversion functions into account, I think the default unit of time should be "tick", not nanoseconds. It seems more natural, and might offer more flexibility in the future.

So instead of:

+#define RTE_HTIMER_FLAG_TIME_TICK RTE_BIT32(2)
+#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(3)

then:

+#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(2)
+#define RTE_HTIMER_FLAG_TIME_NS RTE_BIT32(3)

and perhaps in the future:

+#define RTE_HTIMER_FLAG_TIME_US RTE_BIT32(4)

> +
> +#define RTE_HTIMER_STATE_PENDING 1
> +#define RTE_HTIMER_STATE_EXPIRED 2
> +#define RTE_HTIMER_STATE_CANCELED 3
> +
> +LIST_HEAD(rte_htimer_list, rte_htimer);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_HTIMER_H_ */
> diff --git a/lib/htimer/rte_htimer_mgr.c b/lib/htimer/rte_htimer_mgr.c
> new file mode 100644
> index 0000000000..efdfcf0985
> --- /dev/null
> +++ b/lib/htimer/rte_htimer_mgr.c
> @@ -0,0 +1,547 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +#include <math.h>
> +#include <stdbool.h>
> +#include <sys/queue.h>
> +#include <unistd.h>
> +
> +#include <rte_branch_prediction.h>
> +#include <rte_common.h>
> +#include <rte_cycles.h>
> +#include <rte_errno.h>
> +#include <rte_htw.h>
> +#include <rte_prefetch.h>
> +#include <rte_ring_elem.h>
> +
> +#include "rte_htimer_mgr.h"
> +#include "rte_htimer_msg.h"
> +#include "rte_htimer_msg_ring.h"
> +
> +#define MAX_MSG_BATCH_SIZE 16
> +
> +struct htimer_mgr {
> +	struct rte_htimer_msg_ring *msg_ring;
> +	struct rte_htw *htw;
> +
> +	unsigned int async_msgs_idx __rte_cache_aligned;
> +	unsigned int num_async_msgs;
> +	struct rte_htimer_msg async_msgs[MAX_MSG_BATCH_SIZE];
> +} __rte_cache_aligned;
> +
> +static uint64_t ns_per_tick;
> +static double tsc_per_tick;
> +
> +static struct htimer_mgr mgrs[RTE_MAX_LCORE + 1];
> +
> +#define MAX_ASYNC_TRANSACTIONS 1024
> +#define MSG_RING_SIZE MAX_ASYNC_TRANSACTIONS
> +
> +static inline uint64_t
> +tsc_to_tick(uint64_t tsc)
> +{
> +	return tsc / tsc_per_tick;
> +}
> +
> +static inline uint64_t
> +tsc_to_tick_round_up(uint64_t tsc)
> +{
> +	uint64_t tick;
> +
> +	tick = (tsc + tsc_per_tick / 2) / tsc_per_tick;

This does not round up, it rounds off.

E.g. tsc_per_tick=10.0, tsc=1 becomes (1 + 5.0) / 10.0 = 0.6, which becomes 0 (when converted to integer).

E.g. tsc_per_tick=10.0, tsc=5 becomes (5 + 5.0) / 10.0 = 1.0, which becomes 1.

> +
> +	return tick;
> +}
> +
> +static inline uint64_t
> +ns_to_tick(uint64_t ns)
> +{
> +	return ns / ns_per_tick;
> +}
> +
> +static inline uint64_t
> +ns_to_tick_round_up(uint64_t ns)
> +{
> +	uint64_t tick;
> +
> +	tick = ceil(ns / ns_per_tick);

ns_per_tick is integer, not floating point, so the division is performed as integer division, and ceil() has no effect; i.e. the above is the same as:

tick = ns / ns_per_tick;

Which also means that it does not round up.

> +
> +	return tick;
> +}
> +
> +static inline uint64_t
> +tick_to_ns(uint64_t tick)
> +{
> +	return tick * ns_per_tick;
> +}
> +
> +static struct htimer_mgr *
> +mgr_get(unsigned int lcore_id)
> +{
> +	return &mgrs[lcore_id];
> +}
> +
> +static int
> +mgr_init(unsigned int lcore_id)
> +{
> +	char ring_name[RTE_RING_NAMESIZE];
> +	unsigned int socket_id;
> +	struct htimer_mgr *mgr = &mgrs[lcore_id];
> +
> +	socket_id = rte_lcore_to_socket_id(lcore_id);
> +
> +	snprintf(ring_name, sizeof(ring_name), "htimer_%d", lcore_id);
> +
> +	mgr->msg_ring =
> +		rte_htimer_msg_ring_create(ring_name, MSG_RING_SIZE, socket_id,
> +					   RING_F_SC_DEQ);
> +
> +	if (mgr->msg_ring == NULL)
> +		goto err;
> +
> +	mgr->htw = rte_htw_create();
> +
> +	if (mgr->htw == NULL)
> +		goto err_free_ring;
> +
> +	mgr->async_msgs_idx = 0;
> +	mgr->num_async_msgs = 0;
> +
> +	return 0;
> +
> +err_free_ring:
> +	rte_htimer_msg_ring_free(mgr->msg_ring);
> +err:
> +	return -ENOMEM;
> +}
> +
> +static void
> +mgr_deinit(unsigned int lcore_id)
> +{
> +	struct htimer_mgr *mgr = &mgrs[lcore_id];
> +
> +	rte_htw_destroy(mgr->htw);
> +
> +	rte_htimer_msg_ring_free(mgr->msg_ring);
> +}
> +
> +static volatile bool initialized;
> +
> +static void
> +assure_initialized(void)
> +{
> +	RTE_ASSERT(initialized);
> +}
> +
> +int
> +rte_htimer_mgr_init(uint64_t _ns_per_tick)
> +{
> +	unsigned int lcore_id;
> +
> +	RTE_VERIFY(!initialized);
> +
> +	ns_per_tick = _ns_per_tick;
> +
> +	tsc_per_tick = (ns_per_tick / 1e9) * rte_get_tsc_hz();
> +
> +	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> +		int rc;
> +
> +		rc = mgr_init(lcore_id);
> +
> +		if (rc < 0) {
> +			unsigned int deinit_lcore_id;
> +
> +			for (deinit_lcore_id = 0; deinit_lcore_id < lcore_id;
> +			     deinit_lcore_id++)
> +				mgr_deinit(deinit_lcore_id);
> +
> +			return rc;
> +		}
> +	}
> +
> +	initialized = true;
> +
> +	return 0;
> +}
> +
> +void
> +rte_htimer_mgr_deinit(void)
> +{
> +	unsigned int lcore_id;
> +
> +	assure_initialized();
> +
> +	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> +		mgr_deinit(lcore_id);
> +
> +	initialized = false;
> +}
> +
> +static void
> +assure_valid_time_conversion_flags(uint32_t flags __rte_unused)
> +{
> +	RTE_ASSERT(!((flags & RTE_HTIMER_FLAG_TIME_TSC) &&
> +		     (flags & RTE_HTIMER_FLAG_TIME_TICK)));

With my above suggestion of using tick as default time unit, this would be changed to:

+	RTE_ASSERT(!((flags & RTE_HTIMER_FLAG_TIME_TSC) &&
+		     (flags & RTE_HTIMER_FLAG_TIME_NS) &&
+		     (flags & RTE_HTIMER_FLAG_TIME_US)));

> +}
> +
> +static void
> +assure_valid_add_flags(uint32_t flags)
> +{
> +	assure_valid_time_conversion_flags(flags);
> +
> +	RTE_ASSERT(!(flags & ~(RTE_HTIMER_FLAG_PERIODICAL |
> +			       RTE_HTIMER_FLAG_ABSOLUTE_TIME |
> +			       RTE_HTIMER_FLAG_TIME_TSC |
> +			       RTE_HTIMER_FLAG_TIME_TICK)));

With my above suggestion of using tick as default time unit, this would be changed to:

+	RTE_ASSERT(!(flags & ~(RTE_HTIMER_FLAG_PERIODICAL |
+			       RTE_HTIMER_FLAG_ABSOLUTE_TIME |
+			       RTE_HTIMER_FLAG_TIME_TSC |
+			       RTE_HTIMER_FLAG_TIME_NS |
+			       RTE_HTIMER_FLAG_TIME_US)));

> +}
> +
> +static uint64_t
> +convert_time(uint64_t t, uint32_t flags)
> +{
> +	if (flags & RTE_HTIMER_FLAG_TIME_TSC)
> +		return tsc_to_tick(t);
> +	else if (flags & RTE_HTIMER_FLAG_TIME_TICK)
> +		return t;
> +	else
> +		return ns_to_tick(t);

With my above suggestion of using tick as default time unit, this would be changed to:

+	if (flags & RTE_HTIMER_FLAG_TIME_TSC)
+		return tsc_to_tick(t);
+	else if (flags & RTE_HTIMER_FLAG_TIME_NS)
+		return ns_to_tick(t);
+	else if (flags & RTE_HTIMER_FLAG_TIME_US)
+		return us_to_tick(t);
+	else
+		return t;

> +}
> +
> +void
> +rte_htimer_mgr_add(struct rte_htimer *timer, uint64_t expiration_time,
> +		   uint64_t period, rte_htimer_cb_t timer_cb,
> +		   void *timer_cb_arg, uint32_t flags)
> +{
> +	unsigned int lcore_id = rte_lcore_id();
> +	struct htimer_mgr *mgr = mgr_get(lcore_id);
> +	uint64_t expiration_time_tick;
> +	uint64_t period_tick;
> +
> +	assure_initialized();
> +
> +	assure_valid_add_flags(flags);
> +
> +	expiration_time_tick = convert_time(expiration_time, flags);
> +
> +	period_tick = convert_time(period, flags);
> +
> +	rte_htw_add(mgr->htw, timer, expiration_time_tick, period_tick,
> +		    timer_cb, timer_cb_arg, flags);
> +
> +	timer->owner_lcore_id = lcore_id;
> +}

  parent reply	other threads:[~2023-03-24 16:00 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-28  9:39 [RFC 0/2] Add " Mattias Rönnblom
2023-02-28  9:39 ` [RFC 1/2] eal: add bitset type Mattias Rönnblom
2023-02-28 18:46   ` Tyler Retzlaff
2023-03-02  6:31     ` Mattias Rönnblom
2023-03-02 20:39       ` Tyler Retzlaff
2023-02-28  9:39 ` [RFC 2/2] eal: add high-performance timer facility Mattias Rönnblom
2023-03-05 17:25   ` Stephen Hemminger
2023-03-09 15:20     ` Mattias Rönnblom
2023-02-28 16:01 ` [RFC 0/2] Add " Morten Brørup
2023-03-01 11:18   ` Mattias Rönnblom
2023-03-01 13:31     ` Morten Brørup
2023-03-01 15:50       ` Mattias Rönnblom
2023-03-01 17:06         ` Morten Brørup
2023-03-15 17:03 ` [RFC v2 " Mattias Rönnblom
2023-03-15 17:03   ` [RFC v2 1/2] eal: add bitset type Mattias Rönnblom
2023-03-15 17:20     ` Stephen Hemminger
2023-03-15 18:27       ` Mattias Rönnblom
2023-03-15 17:03   ` [RFC v2 2/2] eal: add high-performance timer facility Mattias Rönnblom
2023-03-16  3:55     ` Tyler Retzlaff
2023-03-17  1:58     ` Stephen Hemminger
2023-03-22 12:18     ` Morten Brørup
2023-04-03 12:04       ` Mattias Rönnblom
2023-04-04  7:32         ` Morten Brørup
2023-03-24 16:00     ` Morten Brørup [this message]
2023-07-06 22:41     ` Stephen Hemminger
2023-07-12  8:58       ` Mattias Rönnblom
2024-10-03 18:36   ` [RFC v2 0/2] Add " Stephen Hemminger
2024-10-03 21:32     ` Morten Brørup
2024-10-06 13:02       ` Mattias Rönnblom
2024-10-06 13:43         ` Morten Brørup
2024-10-06 14:43           ` Mattias Rönnblom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D87809@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=erik.g.carrillo@intel.com \
    --cc=maria.lingemark@ericsson.com \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=roretzla@linux.microsoft.com \
    --cc=stefan.sundkvist@ericsson.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).