Date: Wed, 15 Mar 2023 20:55:56 -0700
From: Tyler Retzlaff
To: Mattias Rönnblom
Cc: dev@dpdk.org, Erik Gabriel Carrillo, David Marchand, maria.lingemark@ericsson.com, Stefan Sundkvist, Stephen Hemminger, Morten Brørup
Subject: Re: [RFC v2 2/2] eal: add high-performance timer facility
Message-ID: <20230316035556.GB5703@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
References: <20230228093916.87206-1-mattias.ronnblom@ericsson.com> <20230315170342.214127-1-mattias.ronnblom@ericsson.com> <20230315170342.214127-3-mattias.ronnblom@ericsson.com>
In-Reply-To: <20230315170342.214127-3-mattias.ronnblom@ericsson.com>
On Wed, Mar 15, 2023 at 06:03:42PM +0100, Mattias Rönnblom wrote:
> The htimer library attempts to provide a timer facility with roughly
> the same functionality as the DPDK timer library, but with less
> overhead and better scalability.
>
> The htimer library employs per-lcore hierarchical timer wheels and a
> message-based synchronization/MT-safety scheme.
>
> RFC v2:
>  * Fix spelling.
>  * Fix signed/unsigned comparisons and discontinue the use of nameless
>    function parameters, both of which may result in compiler warnings.
>  * Undo the accidental removal of the bitset tests from the 'fast_tests'.
>  * Add a number of missing include files, which caused build failures
>    (e.g., on AArch64 builds).
>  * Add a perf test attempting to compare rte_timer, rte_htimer and rte_htw.
>  * Use nanoseconds (instead of TSC) as the default time unit.
>  * add() and manage() have flags which allow the caller to specify the
>    time unit (nanoseconds, TSC, or ticks) for the times provided.
>
> Signed-off-by: Mattias Rönnblom
> ---
>  app/test/meson.build                  |   8 +
>  app/test/test_htimer_mgr.c            | 674 +++++++++++++++++++++++++
>  app/test/test_htimer_mgr_perf.c       | 322 ++++++++++++
>  app/test/test_htw.c                   | 478 ++++++++++++++++++
>  app/test/test_htw_perf.c              | 181 +++++++
>  app/test/test_timer_htimer_htw_perf.c | 693 ++++++++++++++++++++++++++
>  doc/api/doxy-api-index.md             |   5 +-
>  doc/api/doxy-api.conf.in              |   1 +
>  lib/htimer/meson.build                |   7 +
>  lib/htimer/rte_htimer.h               |  68 +++
>  lib/htimer/rte_htimer_mgr.c           | 547 ++++++++++++++++++++
>  lib/htimer/rte_htimer_mgr.h           | 516 +++++++++++++++++++
>  lib/htimer/rte_htimer_msg.h           |  44 ++
>  lib/htimer/rte_htimer_msg_ring.c      |  18 +
>  lib/htimer/rte_htimer_msg_ring.h      |  55 ++
>  lib/htimer/rte_htw.c                  | 445 +++++++++++++++++
>  lib/htimer/rte_htw.h                  |  49 ++
>  lib/htimer/version.map                |  17 +
>  lib/meson.build                       |   1 +
>  19 files changed, 4128 insertions(+), 1 deletion(-)
>  create mode 100644 app/test/test_htimer_mgr.c
>  create mode 100644 app/test/test_htimer_mgr_perf.c
>  create mode 100644 app/test/test_htw.c
>  create mode 100644 app/test/test_htw_perf.c
>  create mode 100644 app/test/test_timer_htimer_htw_perf.c
>  create mode 100644 lib/htimer/meson.build
>  create mode 100644 lib/htimer/rte_htimer.h
>  create mode 100644 lib/htimer/rte_htimer_mgr.c
>  create mode 100644 lib/htimer/rte_htimer_mgr.h
>  create mode 100644 lib/htimer/rte_htimer_msg.h
>  create mode 100644 lib/htimer/rte_htimer_msg_ring.c
>  create mode 100644 lib/htimer/rte_htimer_msg_ring.h
>  create mode 100644 lib/htimer/rte_htw.c
>  create mode 100644 lib/htimer/rte_htw.h
>  create mode 100644 lib/htimer/version.map
>
> diff --git a/app/test/meson.build b/app/test/meson.build
> index 03811ff692..d0308ac09d 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -140,9 +140,14 @@ test_sources = files(
>         'test_thash_perf.c',
>         'test_threads.c',
>         'test_timer.c',
> +       'test_timer_htimer_htw_perf.c',
>         'test_timer_perf.c',
>         'test_timer_racecond.c',
>         'test_timer_secondary.c',
> +       'test_htimer_mgr.c',
> +       'test_htimer_mgr_perf.c',
> +       'test_htw.c',
> +       'test_htw_perf.c',
>         'test_ticketlock.c',
>         'test_trace.c',
>         'test_trace_register.c',
> @@ -193,6 +198,7 @@ fast_tests = [
>         ['fib6_autotest', true, true],
>         ['func_reentrancy_autotest', false, true],
>         ['hash_autotest', true, true],
> +       ['htimer_mgr_autotest', true, true],
>         ['interrupt_autotest', true, true],
>         ['ipfrag_autotest', false, true],
>         ['lcores_autotest', true, true],
> @@ -265,6 +271,8 @@ perf_test_names = [
>         'memcpy_perf_autotest',
>         'hash_perf_autotest',
>         'timer_perf_autotest',
> +       'htimer_mgr_perf_autotest',
> +       'htw_perf_autotest',
>         'reciprocal_division',
>         'reciprocal_division_perf',
>         'lpm_perf_autotest',
> diff --git a/app/test/test_htimer_mgr.c b/app/test/test_htimer_mgr.c
> new file mode 100644
> index 0000000000..9e46dec53e
> --- /dev/null
> +++ b/app/test/test_htimer_mgr.c
> @@ -0,0 +1,674 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 Ericsson AB
> + */
> +
> +#include "test.h"
> +
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +static int
> +timer_lcore(void *arg)
> +{
> +        bool *stop = arg;
> +
> +        while (!__atomic_load_n(stop, __ATOMIC_RELAXED))
> +                rte_htimer_mgr_manage();
> +
> +        return 0;
> +}
> +
> +static void
> +count_timer_cb(struct rte_htimer *timer __rte_unused, void *arg)
> +{
> +        unsigned int *count = arg;
> +
> +        __atomic_fetch_add(count, 1, __ATOMIC_RELAXED);
> +}
> +
> +static void
> +count_async_cb(struct rte_htimer *timer __rte_unused, int result,
> +               void *cb_arg)
> +{
> +        unsigned int *count = cb_arg;
> +
> +        if (result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED)
> +                __atomic_fetch_add(count, 1, __ATOMIC_RELAXED);
> +}
> +
> +static uint64_t
> +s_to_tsc(double s)
> +{
> +        return s * rte_get_tsc_hz();
> +}
> +
> +#define ASYNC_ADD_TEST_EXPIRATION_TIME (250*1000) /* ns */
> +#define ASYNC_TEST_TICK (10*1000) /* ns */
> +
> +static int
> +test_htimer_mgr_async_add(unsigned int num_timers_per_lcore)
> +{
> +        struct rte_htimer *timers;
> +        unsigned int timer_idx;
> +        unsigned int lcore_id;
> +        bool stop = false;
> +        unsigned int timeout_count = 0;
> +        unsigned int async_count = 0;
> +        unsigned int num_workers = 0;
> +        uint64_t expiration_time;
> +        unsigned int num_total_timers;
> +
> +        rte_htimer_mgr_init(ASYNC_TEST_TICK);
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0)
> +                        rte_panic("Unable to launch timer lcore\n");
> +                num_workers++;
> +        }
> +
> +        num_total_timers = num_workers * num_timers_per_lcore;
> +
> +        timers = malloc(num_total_timers * sizeof(struct rte_htimer));
> +        timer_idx = 0;
> +
> +        if (timers == NULL)
> +                rte_panic("Unable to allocate heap memory\n");
> +
> +        expiration_time = ASYNC_ADD_TEST_EXPIRATION_TIME;
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                unsigned int i;
> +
> +                for (i = 0; i < num_timers_per_lcore; i++) {
> +                        struct rte_htimer *timer = &timers[timer_idx++];
> +
> +                        for (;;) {
> +                                int rc;
> +
> +                                rc = rte_htimer_mgr_async_add(timer, lcore_id,
> +                                                              expiration_time,
> +                                                              RTE_HTIMER_FLAG_TIME_TSC,
> +                                                              count_timer_cb,
> +                                                              &timeout_count, 0,
> +                                                              count_async_cb,
> +                                                              &async_count);
> +                                if (unlikely(rc == -EBUSY))
> +                                        rte_htimer_mgr_process();
> +                                else
> +                                        break;
> +                        }
> +                }
> +        }
> +
> +        while (__atomic_load_n(&async_count, __ATOMIC_RELAXED) !=
> +               num_total_timers ||
> +               __atomic_load_n(&timeout_count, __ATOMIC_RELAXED) !=
> +               num_total_timers)
> +                rte_htimer_mgr_manage();
> +
> +        __atomic_store_n(&stop, true, __ATOMIC_RELAXED);
> +
> +        rte_eal_mp_wait_lcore();
> +
> +        rte_htimer_mgr_deinit();
> +
> +        free(timers);
> +
> +        return TEST_SUCCESS;
> +}
> +
> +struct async_recorder_state {
> +        bool timer_cb_run;
> +        bool async_add_cb_run;
> +        bool async_cancel_cb_run;
> +        bool failed;
> +};
> +
> +static void
> +record_async_add_cb(struct rte_htimer *timer __rte_unused,
> +                    int result, void *cb_arg)
> +{
> +        struct async_recorder_state *state = cb_arg;
> +
> +        if (state->failed)
> +                return;
> +
> +        if (state->async_add_cb_run ||
> +            result != RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) {
> +                puts("async add run already");
> +                state->failed = true;
> +        }
> +
> +        state->async_add_cb_run = true;
> +}
> +
> +static void
> +record_async_cancel_cb(struct rte_htimer *timer __rte_unused,
> +                       int result, void *cb_arg)
> +{
> +        struct async_recorder_state *state = cb_arg;
> +
> +        if (state->failed)
> +                return;
> +
> +        if (state->async_cancel_cb_run) {
> +                state->failed = true;
> +                return;
> +        }
> +
> +        switch (result) {
> +        case RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED:
> +                if (!state->timer_cb_run)
> +                        state->failed = true;
> +                break;
> +        case RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED:
> +                if (state->timer_cb_run)
> +                        state->failed = true;
> +                break;
> +        case RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED:
> +                state->failed = true;
> +        }
> +
> +        state->async_cancel_cb_run = true;
> +}
> +
> +static int
> +record_check_consistency(struct async_recorder_state *state)
> +{
> +        if (state->failed)
> +                return -1;
> +
> +        return state->async_cancel_cb_run ? 1 : 0;
> +}
> +
> +static int
> +records_check_consistency(struct async_recorder_state *states,
> +                          unsigned int num_states)
> +{
> +        unsigned int i;
> +        int canceled = 0;
> +
> +        for (i = 0; i < num_states; i++) {
> +                int rc;
> +
> +                rc = record_check_consistency(&states[i]);
> +
> +                if (rc < 0)
> +                        return -1;
> +                canceled += rc;
> +        }
> +
> +        return canceled;
> +}
> +
> +static void
> +log_timer_expiry_cb(struct rte_htimer *timer __rte_unused,
> +                    void *arg)
> +{
> +        bool *timer_run = arg;
> +
> +        *timer_run = true;
> +}
> +
> +#define ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX 10e-3 /* s */
> +
> +static int
> +test_htimer_mgr_async_add_cancel(unsigned int num_timers_per_lcore)
> +{
> +        struct rte_htimer *timers;
> +        struct async_recorder_state *recorder_states;
> +        unsigned int timer_idx = 0;
> +        unsigned int lcore_id;
> +        uint64_t now;
> +        unsigned int num_workers = 0;
> +        bool stop = false;
> +        uint64_t max_expiration_time =
> +                s_to_tsc(ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX);
> +        unsigned int num_total_timers;
> +        int canceled = 0;
> +
> +        rte_htimer_mgr_init(ASYNC_TEST_TICK);
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0)
> +                        rte_panic("Unable to launch timer lcore\n");
> +                num_workers++;
> +        }
> +
> +        num_total_timers = num_workers * num_timers_per_lcore;
> +
> +        timers = malloc(num_total_timers * sizeof(struct rte_htimer));
> +        recorder_states =
> +                malloc(num_total_timers * sizeof(struct async_recorder_state));
> +
> +        if (timers == NULL || recorder_states == NULL)
> +                rte_panic("Unable to allocate heap memory\n");
> +
> +        now = rte_get_tsc_cycles();
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                unsigned int i;
> +
> +                for (i = 0; i < num_timers_per_lcore; i++) {
> +                        struct rte_htimer *timer = &timers[timer_idx];
> +                        struct async_recorder_state *state =
> +                                &recorder_states[timer_idx];
> +
> +                        timer_idx++;
> +
> +                        *state = (struct async_recorder_state) {};
> +
> +                        uint64_t expiration_time =
> +                                now + rte_rand_max(max_expiration_time);
> +
> +                        for (;;) {
> +                                int rc;
> +
> +                                rc = rte_htimer_mgr_async_add(timer, lcore_id,
> +                                                              expiration_time,
> +                                                              0,
> +                                                              log_timer_expiry_cb,
> +                                                              &state->timer_cb_run,
> +                                                              0,
> +                                                              record_async_add_cb,
> +                                                              state);
> +
> +                                if (unlikely(rc == -EBUSY))
> +                                        rte_htimer_mgr_process();
> +                                else
> +                                        break;
> +                        }
> +                }
> +        }
> +
> +        timer_idx = 0;
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                unsigned int i;
> +
> +                for (i = 0; i < num_timers_per_lcore; i++) {
> +                        struct rte_htimer *timer = &timers[timer_idx];
> +                        struct async_recorder_state *state =
> +                                &recorder_states[timer_idx];
> +
> +                        timer_idx++;
> +
> +                        /* cancel roughly half of the timers */
> +                        if (rte_rand_max(2) == 0)
> +                                continue;
> +
> +                        for (;;) {
> +                                int rc;
> +
> +                                rc = rte_htimer_mgr_async_cancel(timer,
> +                                                                 record_async_cancel_cb,
> +                                                                 state);
> +
> +                                if (unlikely(rc == -EBUSY)) {
> +                                        puts("busy");
> +                                        rte_htimer_mgr_process();
> +                                } else
> +                                        break;
> +                        }
> +
> +                        canceled++;
> +                }
> +        }
> +
> +        for (;;) {
> +                int cancel_completed;
> +
> +                cancel_completed = records_check_consistency(recorder_states,
> +                                                             num_total_timers);
> +
> +                if (cancel_completed < 0) {
> +                        puts("Inconsistency found");
> +                        return TEST_FAILED;
> +                }
> +
> +                if (cancel_completed == canceled)
> +                        break;
> +
> +                rte_htimer_mgr_process();
> +        }
> +
> +        __atomic_store_n(&stop, true, __ATOMIC_RELAXED);
> +
> +        rte_eal_mp_wait_lcore();
> +
> +        rte_htimer_mgr_deinit();
> +
> +        free(timers);
> +        free(recorder_states);
> +
> +        return TEST_SUCCESS;
> +}
> +
> +/*
> + * This is a test case where one thread asynchronously adds two timers,
> + * with the same expiration time; one on the local lcore and one on a
> + * remote lcore. This creates a tricky situation for the timer
> + * manager, and for the application as well, if the htimer struct is
> + * dynamically allocated.
> + */
> +
> +struct test_timer {
> +        uint32_t ref_cnt;
> +        uint64_t expiration_time; /* in TSC, not tick */
> +        uint32_t *timeout_count;
> +        bool *failure_occurred;
> +        struct rte_htimer htimer;
> +};
> +
> +static struct test_timer *
> +test_timer_create(uint64_t expiration_time, uint32_t *timeout_count,
> +                  bool *failure_occurred)
> +{
> +        struct test_timer *timer;
> +
> +        timer = malloc(sizeof(struct test_timer));
> +
> +        if (timer == NULL)
> +                rte_panic("Unable to allocate timer memory\n");
> +
> +        timer->ref_cnt = 1;
> +        timer->expiration_time = expiration_time;
> +        timer->timeout_count = timeout_count;
> +        timer->failure_occurred = failure_occurred;
> +
> +        return timer;
> +}
> +
> +static void
> +test_timer_inc_ref_cnt(struct test_timer *timer)
> +{
> +        __atomic_add_fetch(&timer->ref_cnt, 1, __ATOMIC_RELEASE);

Please use __atomic_fetch_add() instead.

There's future work to align DPDK with C11 atomics, and the
__atomic_fetch_<op> form is preferred because the conversion then
largely becomes s/__atomic_/atomic_/ (well, mostly...).

> +}
> +
> +static void
> +test_timer_dec_ref_cnt(struct test_timer *timer)
> +{
> +        if (timer != NULL) {
> +                uint32_t cnt = __atomic_sub_fetch(&timer->ref_cnt, 1,
> +                                                  __ATOMIC_RELEASE);

Same here.

I'll try to get a patch up for a checkpatches warning soon.

thanks!