Date: Wed, 15 Mar 2023 20:55:56 -0700
From: Tyler Retzlaff
To: Mattias Rönnblom
Cc: dev@dpdk.org, Erik Gabriel Carrillo, David Marchand, maria.lingemark@ericsson.com, Stefan Sundkvist, Stephen Hemminger, Morten Brørup
Subject: Re: [RFC v2 2/2] eal: add high-performance timer facility
Message-ID: <20230316035556.GB5703@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
References: <20230228093916.87206-1-mattias.ronnblom@ericsson.com> <20230315170342.214127-1-mattias.ronnblom@ericsson.com> <20230315170342.214127-3-mattias.ronnblom@ericsson.com>
In-Reply-To: <20230315170342.214127-3-mattias.ronnblom@ericsson.com>
On Wed, Mar 15, 2023 at 06:03:42PM +0100, Mattias Rönnblom wrote:
> The htimer library attempts to provide a timer facility with roughly
> the same functionality as the DPDK timer library, but with less
> overhead and better scalability.
>
> The htimer library employs per-lcore hierarchical timer wheels and a
> message-based synchronization/MT-safety scheme.
>
> RFC v2:
>  * Fix spelling.
>  * Fix signed/unsigned comparisons and discontinue the use of nameless
>    function parameters, both of which may result in compiler warnings.
>  * Undo the accidental removal of the bitset tests from the 'fast_tests'.
>  * Add a number of missing include files, which caused build failures
>    (e.g., on AArch64 builds).
>  * Add a perf test attempting to compare rte_timer, rte_htimer and rte_htw.
>  * Use nanoseconds (instead of TSC) as the default time unit.
>  * add() and manage() have flags which allow the caller to specify the
>    time unit (nanoseconds, TSC, or ticks) for the times provided.
>
> Signed-off-by: Mattias Rönnblom
> ---
>  app/test/meson.build                  |   8 +
>  app/test/test_htimer_mgr.c            | 674 +++++++++++++++++++++++++
>  app/test/test_htimer_mgr_perf.c       | 322 ++++++++++++
>  app/test/test_htw.c                   | 478 ++++++++++++++++++
>  app/test/test_htw_perf.c              | 181 +++++++
>  app/test/test_timer_htimer_htw_perf.c | 693 ++++++++++++++++++++++++++
>  doc/api/doxy-api-index.md             |   5 +-
>  doc/api/doxy-api.conf.in              |   1 +
>  lib/htimer/meson.build                |   7 +
>  lib/htimer/rte_htimer.h               |  68 +++
>  lib/htimer/rte_htimer_mgr.c           | 547 ++++++++++++++++++++
>  lib/htimer/rte_htimer_mgr.h           | 516 +++++++++++++++++++
>  lib/htimer/rte_htimer_msg.h           |  44 ++
>  lib/htimer/rte_htimer_msg_ring.c      |  18 +
>  lib/htimer/rte_htimer_msg_ring.h      |  55 ++
>  lib/htimer/rte_htw.c                  | 445 +++++++++++++++++
>  lib/htimer/rte_htw.h                  |  49 ++
>  lib/htimer/version.map                |  17 +
>  lib/meson.build                       |   1 +
>  19 files changed, 4128 insertions(+), 1 deletion(-)
>  create mode 100644 app/test/test_htimer_mgr.c
>  create mode 100644 app/test/test_htimer_mgr_perf.c
>  create mode 100644 app/test/test_htw.c
>  create mode 100644 app/test/test_htw_perf.c
>  create mode 100644 app/test/test_timer_htimer_htw_perf.c
>  create mode 100644 lib/htimer/meson.build
>  create mode 100644 lib/htimer/rte_htimer.h
>  create mode 100644 lib/htimer/rte_htimer_mgr.c
>  create mode 100644 lib/htimer/rte_htimer_mgr.h
>  create mode 100644 lib/htimer/rte_htimer_msg.h
>  create mode 100644 lib/htimer/rte_htimer_msg_ring.c
>  create mode 100644 lib/htimer/rte_htimer_msg_ring.h
>  create mode 100644 lib/htimer/rte_htw.c
>  create mode 100644 lib/htimer/rte_htw.h
>  create mode 100644 lib/htimer/version.map
>
> diff --git a/app/test/meson.build b/app/test/meson.build
> index 03811ff692..d0308ac09d 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -140,9 +140,14 @@ test_sources = files(
>         'test_thash_perf.c',
>         'test_threads.c',
>         'test_timer.c',
> +       'test_timer_htimer_htw_perf.c',
>         'test_timer_perf.c',
>         'test_timer_racecond.c',
>         'test_timer_secondary.c',
> +       'test_htimer_mgr.c',
> +       'test_htimer_mgr_perf.c',
> +       'test_htw.c',
> +       'test_htw_perf.c',
>         'test_ticketlock.c',
>         'test_trace.c',
>         'test_trace_register.c',
> @@ -193,6 +198,7 @@ fast_tests = [
>         ['fib6_autotest', true, true],
>         ['func_reentrancy_autotest', false, true],
>         ['hash_autotest', true, true],
> +       ['htimer_mgr_autotest', true, true],
>         ['interrupt_autotest', true, true],
>         ['ipfrag_autotest', false, true],
>         ['lcores_autotest', true, true],
> @@ -265,6 +271,8 @@ perf_test_names = [
>         'memcpy_perf_autotest',
>         'hash_perf_autotest',
>         'timer_perf_autotest',
> +       'htimer_mgr_perf_autotest',
> +       'htw_perf_autotest',
>         'reciprocal_division',
>         'reciprocal_division_perf',
>         'lpm_perf_autotest',
> diff --git a/app/test/test_htimer_mgr.c b/app/test/test_htimer_mgr.c
> new file mode 100644
> index 0000000000..9e46dec53e
> --- /dev/null
> +++ b/app/test/test_htimer_mgr.c
> @@ -0,0 +1,674 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2023 Ericsson AB
> + */
> +
> +#include "test.h"
> +
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +static int
> +timer_lcore(void *arg)
> +{
> +        bool *stop = arg;
> +
> +        while (!__atomic_load_n(stop, __ATOMIC_RELAXED))
> +                rte_htimer_mgr_manage();
> +
> +        return 0;
> +}
> +
> +static void
> +count_timer_cb(struct rte_htimer *timer __rte_unused, void *arg)
> +{
> +        unsigned int *count = arg;
> +
> +        __atomic_fetch_add(count, 1, __ATOMIC_RELAXED);
> +}
> +
> +static void
> +count_async_cb(struct rte_htimer *timer __rte_unused, int result,
> +               void *cb_arg)
> +{
> +        unsigned int *count = cb_arg;
> +
> +        if (result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED)
> +                __atomic_fetch_add(count, 1, __ATOMIC_RELAXED);
> +}
> +
> +static uint64_t
> +s_to_tsc(double s)
> +{
> +        return s * rte_get_tsc_hz();
> +}
> +
> +#define ASYNC_ADD_TEST_EXPIRATION_TIME (250*1000) /* ns */
> +#define ASYNC_TEST_TICK (10*1000) /* ns */
> +
> +static int
> +test_htimer_mgr_async_add(unsigned int num_timers_per_lcore)
> +{
> +        struct rte_htimer *timers;
> +        unsigned int timer_idx;
> +        unsigned int lcore_id;
> +        bool stop = false;
> +        unsigned int timeout_count = 0;
> +        unsigned int async_count = 0;
> +        unsigned int num_workers = 0;
> +        uint64_t expiration_time;
> +        unsigned int num_total_timers;
> +
> +        rte_htimer_mgr_init(ASYNC_TEST_TICK);
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0)
> +                        rte_panic("Unable to launch timer lcore\n");
> +                num_workers++;
> +        }
> +
> +        num_total_timers = num_workers * num_timers_per_lcore;
> +
> +        timers = malloc(num_total_timers * sizeof(struct rte_htimer));
> +        timer_idx = 0;
> +
> +        if (timers == NULL)
> +                rte_panic("Unable to allocate heap memory\n");
> +
> +        expiration_time = ASYNC_ADD_TEST_EXPIRATION_TIME;
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                unsigned int i;
> +
> +                for (i = 0; i < num_timers_per_lcore; i++) {
> +                        struct rte_htimer *timer = &timers[timer_idx++];
> +
> +                        for (;;) {
> +                                int rc;
> +
> +                                rc = rte_htimer_mgr_async_add(timer, lcore_id,
> +                                                              expiration_time,
> +                                                              RTE_HTIMER_FLAG_TIME_TSC,
> +                                                              count_timer_cb,
> +                                                              &timeout_count, 0,
> +                                                              count_async_cb,
> +                                                              &async_count);
> +                                if (unlikely(rc == -EBUSY))
> +                                        rte_htimer_mgr_process();
> +                                else
> +                                        break;
> +                        }
> +                }
> +        }
> +
> +        while (__atomic_load_n(&async_count, __ATOMIC_RELAXED) !=
> +               num_total_timers ||
> +               __atomic_load_n(&timeout_count, __ATOMIC_RELAXED) !=
> +               num_total_timers)
> +                rte_htimer_mgr_manage();
> +
> +        __atomic_store_n(&stop, true, __ATOMIC_RELAXED);
> +
> +        rte_eal_mp_wait_lcore();
> +
> +        rte_htimer_mgr_deinit();
> +
> +        free(timers);
> +
> +        return TEST_SUCCESS;
> +}
> +
> +struct async_recorder_state {
> +        bool timer_cb_run;
> +        bool async_add_cb_run;
> +        bool async_cancel_cb_run;
> +        bool failed;
> +};
> +
> +static void
> +record_async_add_cb(struct rte_htimer *timer __rte_unused,
> +                    int result, void *cb_arg)
> +{
> +        struct async_recorder_state *state = cb_arg;
> +
> +        if (state->failed)
> +                return;
> +
> +        if (state->async_add_cb_run ||
> +            result != RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) {
> +                puts("async add run already");
> +                state->failed = true;
> +        }
> +
> +        state->async_add_cb_run = true;
> +}
> +
> +static void
> +record_async_cancel_cb(struct rte_htimer *timer __rte_unused,
> +                       int result, void *cb_arg)
> +{
> +        struct async_recorder_state *state = cb_arg;
> +
> +        if (state->failed)
> +                return;
> +
> +        if (state->async_cancel_cb_run) {
> +                state->failed = true;
> +                return;
> +        }
> +
> +        switch (result) {
> +        case RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED:
> +                if (!state->timer_cb_run)
> +                        state->failed = true;
> +                break;
> +        case RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED:
> +                if (state->timer_cb_run)
> +                        state->failed = true;
> +                break;
> +        case RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED:
> +                state->failed = true;
> +        }
> +
> +        state->async_cancel_cb_run = true;
> +}
> +
> +static int
> +record_check_consistency(struct async_recorder_state *state)
> +{
> +        if (state->failed)
> +                return -1;
> +
> +        return state->async_cancel_cb_run ? 1 : 0;
> +}
> +
> +static int
> +records_check_consistency(struct async_recorder_state *states,
> +                          unsigned int num_states)
> +{
> +        unsigned int i;
> +        int canceled = 0;
> +
> +        for (i = 0; i < num_states; i++) {
> +                int rc;
> +
> +                rc = record_check_consistency(&states[i]);
> +
> +                if (rc < 0)
> +                        return -1;
> +                canceled += rc;
> +        }
> +
> +        return canceled;
> +}
> +
> +static void
> +log_timer_expiry_cb(struct rte_htimer *timer __rte_unused,
> +                    void *arg)
> +{
> +        bool *timer_run = arg;
> +
> +        *timer_run = true;
> +}
> +
> +#define ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX 10e-3 /* s */
> +
> +static int
> +test_htimer_mgr_async_add_cancel(unsigned int num_timers_per_lcore)
> +{
> +        struct rte_htimer *timers;
> +        struct async_recorder_state *recorder_states;
> +        unsigned int timer_idx = 0;
> +        unsigned int lcore_id;
> +        uint64_t now;
> +        unsigned int num_workers = 0;
> +        bool stop = false;
> +        uint64_t max_expiration_time =
> +                s_to_tsc(ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX);
> +        unsigned int num_total_timers;
> +        int canceled = 0;
> +
> +        rte_htimer_mgr_init(ASYNC_TEST_TICK);
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0)
> +                        rte_panic("Unable to launch timer lcore\n");
> +                num_workers++;
> +        }
> +
> +        num_total_timers = num_workers * num_timers_per_lcore;
> +
> +        timers = malloc(num_total_timers * sizeof(struct rte_htimer));
> +        recorder_states =
> +                malloc(num_total_timers * sizeof(struct async_recorder_state));
> +
> +        if (timers == NULL || recorder_states == NULL)
> +                rte_panic("Unable to allocate heap memory\n");
> +
> +        now = rte_get_tsc_cycles();
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                unsigned int i;
> +
> +                for (i = 0; i < num_timers_per_lcore; i++) {
> +                        struct rte_htimer *timer = &timers[timer_idx];
> +                        struct async_recorder_state *state =
> +                                &recorder_states[timer_idx];
> +
> +                        timer_idx++;
> +
> +                        *state = (struct async_recorder_state) {};
> +
> +                        uint64_t expiration_time =
> +                                now + rte_rand_max(max_expiration_time);
> +
> +                        for (;;) {
> +                                int rc;
> +
> +                                rc = rte_htimer_mgr_async_add(timer, lcore_id,
> +                                                              expiration_time,
> +                                                              0,
> +                                                              log_timer_expiry_cb,
> +                                                              &state->timer_cb_run,
> +                                                              0,
> +                                                              record_async_add_cb,
> +                                                              state);
> +
> +                                if (unlikely(rc == -EBUSY))
> +                                        rte_htimer_mgr_process();
> +                                else
> +                                        break;
> +                        }
> +                }
> +        }
> +
> +        timer_idx = 0;
> +
> +        RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +                unsigned int i;
> +
> +                for (i = 0; i < num_timers_per_lcore; i++) {
> +                        struct rte_htimer *timer = &timers[timer_idx];
> +                        struct async_recorder_state *state =
> +                                &recorder_states[timer_idx];
> +
> +                        timer_idx++;
> +
> +                        /* cancel roughly half of the timers */
> +                        if (rte_rand_max(2) == 0)
> +                                continue;
> +
> +                        for (;;) {
> +                                int rc;
> +
> +                                rc = rte_htimer_mgr_async_cancel(timer,
> +                                                                 record_async_cancel_cb,
> +                                                                 state);
> +
> +                                if (unlikely(rc == -EBUSY)) {
> +                                        puts("busy");
> +                                        rte_htimer_mgr_process();
> +                                } else
> +                                        break;
> +                        }
> +
> +                        canceled++;
> +                }
> +        }
> +
> +        for (;;) {
> +                int cancel_completed;
> +
> +                cancel_completed = records_check_consistency(recorder_states,
> +                                                             num_total_timers);
> +
> +                if (cancel_completed < 0) {
> +                        puts("Inconsistency found");
> +                        return TEST_FAILED;
> +                }
> +
> +                if (cancel_completed == canceled)
> +                        break;
> +
> +                rte_htimer_mgr_process();
> +        }
> +
> +        __atomic_store_n(&stop, true, __ATOMIC_RELAXED);
> +
> +        rte_eal_mp_wait_lcore();
> +
> +        rte_htimer_mgr_deinit();
> +
> +        free(timers);
> +        free(recorder_states);
> +
> +        return TEST_SUCCESS;
> +}
> +
> +/*
> + * This is a test case where one thread asynchronously adds two timers,
> + * with the same expiration time; one on the local lcore and one on a
> + * remote lcore. This creates a tricky situation for the timer
> + * manager, and for the application as well, if the htimer struct is
> + * dynamically allocated.
> + */
> +
> +struct test_timer {
> +        uint32_t ref_cnt;
> +        uint64_t expiration_time; /* in TSC, not tick */
> +        uint32_t *timeout_count;
> +        bool *failure_occurred;
> +        struct rte_htimer htimer;
> +};
> +
> +static struct test_timer *
> +test_timer_create(uint64_t expiration_time, uint32_t *timeout_count,
> +                  bool *failure_occurred)
> +{
> +        struct test_timer *timer;
> +
> +        timer = malloc(sizeof(struct test_timer));
> +
> +        if (timer == NULL)
> +                rte_panic("Unable to allocate timer memory\n");
> +
> +        timer->ref_cnt = 1;
> +        timer->expiration_time = expiration_time;
> +        timer->timeout_count = timeout_count;
> +        timer->failure_occurred = failure_occurred;
> +
> +        return timer;
> +}
> +
> +static void
> +test_timer_inc_ref_cnt(struct test_timer *timer)
> +{
> +        __atomic_add_fetch(&timer->ref_cnt, 1, __ATOMIC_RELEASE);

Please use __atomic_fetch_add() instead.

There's future work to align DPDK with C11 atomics, and the
__atomic_fetch_<op> form is preferred because the conversion then
largely becomes s/__atomic_/atomic_/ (well, mostly...).

> +}
> +
> +static void
> +test_timer_dec_ref_cnt(struct test_timer *timer)
> +{
> +        if (timer != NULL) {
> +                uint32_t cnt = __atomic_sub_fetch(&timer->ref_cnt, 1,
> +                                                  __ATOMIC_RELEASE);

Same here.

I'll try to get a patch up for a checkpatches warning soon.

thanks!