From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
To: dev@dpdk.org
Cc: Erik Gabriel Carrillo, David Marchand, Stefan Sundkvist,
	Stephen Hemminger, Morten Brørup, Tyler Retzlaff,
	Mattias Rönnblom
Subject: [RFC v2 2/2] eal: add high-performance timer facility
Date: Wed, 15 Mar 2023 18:03:42 +0100
Message-ID: <20230315170342.214127-3-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230315170342.214127-1-mattias.ronnblom@ericsson.com>
References: <20230228093916.87206-1-mattias.ronnblom@ericsson.com>
	<20230315170342.214127-1-mattias.ronnblom@ericsson.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

The htimer library attempts to provide a timer facility with roughly
the same functionality as, but with less overhead and better
scalability than, the DPDK timer library.

The htimer library employs per-lcore hierarchical timer wheels and a
message-based synchronization/MT-safety scheme.

RFC v2:
 * Fix spelling.
 * Fix signed/unsigned comparisons and discontinue the use of name-less
   function parameters, both of which may result in compiler warnings.
 * Undo the accidental removal of the bitset tests from the 'fast_tests'.
 * Add a number of missing include files, whose absence caused build
   failures (e.g., on AArch64 builds).
 * Add a perf test attempting to compare rte_timer, rte_htimer and
   rte_htw.
 * Use nanoseconds (instead of TSC) as the default time unit.
 * add() and manage() have flags which allow the caller to specify the
   time unit (nanoseconds, TSC, or ticks) for the times provided.

Signed-off-by: Mattias Rönnblom
---
 app/test/meson.build                  |   8 +
 app/test/test_htimer_mgr.c            | 674 ++++++++++++++++++++++++++
 app/test/test_htimer_mgr_perf.c       | 322 ++++++++++++
 app/test/test_htw.c                   | 478 ++++++++++++++++++
 app/test/test_htw_perf.c              | 181 +++++++
 app/test/test_timer_htimer_htw_perf.c | 693 ++++++++++++++++++++++++++
 doc/api/doxy-api-index.md             |   5 +-
 doc/api/doxy-api.conf.in              |   1 +
 lib/htimer/meson.build                |   7 +
 lib/htimer/rte_htimer.h               |  68 +++
 lib/htimer/rte_htimer_mgr.c           | 547 ++++++++++++++++++++
 lib/htimer/rte_htimer_mgr.h           | 516 +++++++++++++++++++
 lib/htimer/rte_htimer_msg.h           |  44 ++
 lib/htimer/rte_htimer_msg_ring.c      |  18 +
 lib/htimer/rte_htimer_msg_ring.h      |  55 ++
 lib/htimer/rte_htw.c                  | 445 +++++++++++++++++
 lib/htimer/rte_htw.h                  |  49 ++
 lib/htimer/version.map                |  17 +
 lib/meson.build                       |   1 +
 19 files changed, 4128 insertions(+), 1 deletion(-)
 create mode 100644 app/test/test_htimer_mgr.c
 create mode 100644 app/test/test_htimer_mgr_perf.c
 create mode 100644 app/test/test_htw.c
 create mode 100644 app/test/test_htw_perf.c
 create mode 100644 app/test/test_timer_htimer_htw_perf.c
 create mode 100644 lib/htimer/meson.build
 create mode 100644 lib/htimer/rte_htimer.h
 create mode 100644 lib/htimer/rte_htimer_mgr.c
 create mode 100644 lib/htimer/rte_htimer_mgr.h
 create mode 100644 lib/htimer/rte_htimer_msg.h
 create mode 100644 lib/htimer/rte_htimer_msg_ring.c
 create mode 100644 lib/htimer/rte_htimer_msg_ring.h
 create mode 100644 lib/htimer/rte_htw.c
 create mode 100644 lib/htimer/rte_htw.h
 create mode 100644 lib/htimer/version.map

diff --git a/app/test/meson.build b/app/test/meson.build
index 03811ff692..d0308ac09d 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -140,9 +140,14 @@ test_sources = files(
         'test_thash_perf.c',
         'test_threads.c',
         'test_timer.c',
+        'test_timer_htimer_htw_perf.c',
         'test_timer_perf.c',
         'test_timer_racecond.c',
         'test_timer_secondary.c',
+        'test_htimer_mgr.c',
+        'test_htimer_mgr_perf.c',
+        'test_htw.c',
+        'test_htw_perf.c',
         'test_ticketlock.c',
         'test_trace.c',
         'test_trace_register.c',
@@ -193,6 +198,7 @@ fast_tests = [
         ['fib6_autotest', true, true],
         ['func_reentrancy_autotest', false, true],
         ['hash_autotest', true,
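
As a rough, illustrative sketch of the rte_htimer_mgr API summarized in
the commit message above, and using only calls that appear in the tests
in this patch (rte_htimer_mgr_init(), rte_htimer_mgr_add(),
rte_htimer_mgr_manage(), rte_htimer_mgr_deinit()), a minimal
single-lcore user could look something like the following. The tick
length and timeout are made-up values; by default, expiration times are
relative and given in nanoseconds, per the v2 notes above.

#include <stdbool.h>
#include <rte_htimer_mgr.h>

static bool timer_fired;

/* runs on the owning lcore, from within a manage() call */
static void
example_cb(struct rte_htimer *timer __rte_unused, void *cb_arg __rte_unused)
{
        timer_fired = true;
}

static void
example(void)
{
        struct rte_htimer timer;

        /* tick length in nanoseconds; 10 us is an arbitrary choice */
        rte_htimer_mgr_init(10 * 1000);

        /* the tests in this patch call manage() once before arming timers */
        rte_htimer_mgr_manage();

        /* one-shot timer, ~1 ms from now (relative time, in ns) */
        rte_htimer_mgr_add(&timer, 1000 * 1000, 0, example_cb, NULL, 0);

        while (!timer_fired)
                rte_htimer_mgr_manage();

        rte_htimer_mgr_deinit();
}
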
true], + ['htimer_mgr_autotest', true, true], ['interrupt_autotest', true, true], ['ipfrag_autotest', false, true], ['lcores_autotest', true, true], @@ -265,6 +271,8 @@ perf_test_names = [ 'memcpy_perf_autotest', 'hash_perf_autotest', 'timer_perf_autotest', + 'htimer_mgr_perf_autotest', + 'htw_perf_autotest', 'reciprocal_division', 'reciprocal_division_perf', 'lpm_perf_autotest', diff --git a/app/test/test_htimer_mgr.c b/app/test/test_htimer_mgr.c new file mode 100644 index 0000000000..9e46dec53e --- /dev/null +++ b/app/test/test_htimer_mgr.c @@ -0,0 +1,674 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +static int +timer_lcore(void *arg) +{ + bool *stop = arg; + + while (!__atomic_load_n(stop, __ATOMIC_RELAXED)) + rte_htimer_mgr_manage(); + + return 0; +} + +static void +count_timer_cb(struct rte_htimer *timer __rte_unused, void *arg) +{ + unsigned int *count = arg; + + __atomic_fetch_add(count, 1, __ATOMIC_RELAXED); +} + +static void +count_async_cb(struct rte_htimer *timer __rte_unused, int result, + void *cb_arg) +{ + unsigned int *count = cb_arg; + + if (result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) + __atomic_fetch_add(count, 1, __ATOMIC_RELAXED); +} + +static uint64_t +s_to_tsc(double s) +{ + return s * rte_get_tsc_hz(); +} + +#define ASYNC_ADD_TEST_EXPIRATION_TIME (250*1000) /* ns */ +#define ASYNC_TEST_TICK (10*1000) /* ns */ + +static int +test_htimer_mgr_async_add(unsigned int num_timers_per_lcore) +{ + struct rte_htimer *timers; + unsigned int timer_idx; + unsigned int lcore_id; + bool stop = false; + unsigned int timeout_count = 0; + unsigned int async_count = 0; + unsigned int num_workers = 0; + uint64_t expiration_time; + unsigned int num_total_timers; + + rte_htimer_mgr_init(ASYNC_TEST_TICK); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0) + rte_panic("Unable to launch timer lcore\n"); + num_workers++; + } + + num_total_timers = num_workers * num_timers_per_lcore; + + timers = malloc(num_total_timers * sizeof(struct rte_htimer)); + timer_idx = 0; + + if (timers == NULL) + rte_panic("Unable to allocate heap memory\n"); + + expiration_time = ASYNC_ADD_TEST_EXPIRATION_TIME; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + struct rte_htimer *timer = &timers[timer_idx++]; + + for (;;) { + int rc; + + rc = rte_htimer_mgr_async_add(timer, lcore_id, + expiration_time, + RTE_HTIMER_FLAG_TIME_TSC, + count_timer_cb, + &timeout_count, 0, + count_async_cb, + &async_count); + if (unlikely(rc == -EBUSY)) + rte_htimer_mgr_process(); + else + break; + } + } + } + + while (__atomic_load_n(&async_count, __ATOMIC_RELAXED) != + num_total_timers || + __atomic_load_n(&timeout_count, __ATOMIC_RELAXED) != + num_total_timers) + rte_htimer_mgr_manage(); + + __atomic_store_n(&stop, true, __ATOMIC_RELAXED); + + rte_eal_mp_wait_lcore(); + + rte_htimer_mgr_deinit(); + + free(timers); + + return TEST_SUCCESS; +} + +struct async_recorder_state { + bool timer_cb_run; + bool async_add_cb_run; + bool async_cancel_cb_run; + bool failed; +}; + +static void +record_async_add_cb(struct rte_htimer *timer __rte_unused, + int result, void *cb_arg) +{ + struct async_recorder_state *state = cb_arg; + + if (state->failed) + return; + + if (state->async_add_cb_run || + result != RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) { + puts("async add run already"); + 
state->failed = true; + } + + state->async_add_cb_run = true; +} + +static void +record_async_cancel_cb(struct rte_htimer *timer __rte_unused, + int result, void *cb_arg) +{ + struct async_recorder_state *state = cb_arg; + + if (state->failed) + return; + + if (state->async_cancel_cb_run) { + state->failed = true; + return; + } + + switch (result) { + case RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED: + if (!state->timer_cb_run) + state->failed = true; + break; + case RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED: + if (state->timer_cb_run) + state->failed = true; + break; + case RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED: + state->failed = true; + } + + state->async_cancel_cb_run = true; +} + +static int +record_check_consistency(struct async_recorder_state *state) +{ + if (state->failed) + return -1; + + return state->async_cancel_cb_run ? 1 : 0; +} + +static int +records_check_consistency(struct async_recorder_state *states, + unsigned int num_states) +{ + unsigned int i; + int canceled = 0; + + for (i = 0; i < num_states; i++) { + int rc; + + rc = record_check_consistency(&states[i]); + + if (rc < 0) + return -1; + canceled += rc; + } + + return canceled; +} + +static void +log_timer_expiry_cb(struct rte_htimer *timer __rte_unused, + void *arg) +{ + bool *timer_run = arg; + + *timer_run = true; +} + + +#define ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX 10e-3 /* s */ + +static int +test_htimer_mgr_async_add_cancel(unsigned int num_timers_per_lcore) +{ + struct rte_htimer *timers; + struct async_recorder_state *recorder_states; + unsigned int timer_idx = 0; + unsigned int lcore_id; + uint64_t now; + unsigned int num_workers = 0; + bool stop = false; + uint64_t max_expiration_time = + s_to_tsc(ASYNC_ADD_CANCEL_TEST_EXPIRATION_TIME_MAX); + unsigned int num_total_timers; + int canceled = 0; + + rte_htimer_mgr_init(ASYNC_TEST_TICK); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + if (rte_eal_remote_launch(timer_lcore, &stop, lcore_id) != 0) + rte_panic("Unable to launch timer lcore\n"); + num_workers++; + } + + num_total_timers = num_workers * num_timers_per_lcore; + + timers = malloc(num_total_timers * sizeof(struct rte_htimer)); + recorder_states = + malloc(num_total_timers * sizeof(struct async_recorder_state)); + + if (timers == NULL || recorder_states == NULL) + rte_panic("Unable to allocate heap memory\n"); + + now = rte_get_tsc_cycles(); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + struct rte_htimer *timer = &timers[timer_idx]; + struct async_recorder_state *state = + &recorder_states[timer_idx]; + + timer_idx++; + + *state = (struct async_recorder_state) {}; + + uint64_t expiration_time = + now + rte_rand_max(max_expiration_time); + + for (;;) { + int rc; + + rc = rte_htimer_mgr_async_add(timer, lcore_id, + expiration_time, + 0, + log_timer_expiry_cb, + &state->timer_cb_run, + 0, + record_async_add_cb, + state); + + if (unlikely(rc == -EBUSY)) + rte_htimer_mgr_process(); + else + break; + } + } + } + + timer_idx = 0; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + struct rte_htimer *timer = &timers[timer_idx]; + struct async_recorder_state *state = + &recorder_states[timer_idx]; + + timer_idx++; + + /* cancel roughly half of the timers */ + if (rte_rand_max(2) == 0) + continue; + + for (;;) { + int rc; + + rc = rte_htimer_mgr_async_cancel(timer, + record_async_cancel_cb, + state); + + if (unlikely(rc == -EBUSY)) { + puts("busy"); + rte_htimer_mgr_process(); + } else + break; + } + 
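                        /*
                         * Like rte_htimer_mgr_async_add() above,
                         * rte_htimer_mgr_async_cancel() may return -EBUSY,
                         * presumably because the request could not be queued
                         * to the target lcore (the htimer-internal message
                         * ring being full). In that case the loop above
                         * drains pending responses with
                         * rte_htimer_mgr_process() and retries.
                         */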
+ canceled++; + } + } + + for (;;) { + int cancel_completed; + + cancel_completed = records_check_consistency(recorder_states, + num_total_timers); + + if (cancel_completed < 0) { + puts("Inconstinency found"); + return TEST_FAILED; + } + + if (cancel_completed == canceled) + break; + + rte_htimer_mgr_process(); + } + + __atomic_store_n(&stop, true, __ATOMIC_RELAXED); + + rte_eal_mp_wait_lcore(); + + rte_htimer_mgr_deinit(); + + free(timers); + free(recorder_states); + + return TEST_SUCCESS; +} + +/* + * This is a test case where one thread asynchronously adds two timers, + * with the same expiration time; one on the local lcore and one on a + * remote lcore. This creates a tricky situation for the timer + * manager, and for the application as well, if the htimer struct is + * dynamically allocated. + */ + +struct test_timer { + uint32_t ref_cnt; + uint64_t expiration_time; /* in TSC, not tick */ + uint32_t *timeout_count; + bool *failure_occurred; + struct rte_htimer htimer; +}; + + +static struct test_timer * +test_timer_create(uint64_t expiration_time, uint32_t *timeout_count, + bool *failure_occurred) +{ + struct test_timer *timer; + + timer = malloc(sizeof(struct test_timer)); + + if (timer == NULL) + rte_panic("Unable to allocate timer memory\n"); + + timer->ref_cnt = 1; + timer->expiration_time = expiration_time; + timer->timeout_count = timeout_count; + timer->failure_occurred = failure_occurred; + + return timer; +} + +static void +test_timer_inc_ref_cnt(struct test_timer *timer) +{ + __atomic_add_fetch(&timer->ref_cnt, 1, __ATOMIC_RELEASE); +} + +static void +test_timer_dec_ref_cnt(struct test_timer *timer) +{ + if (timer != NULL) { + uint32_t cnt = __atomic_sub_fetch(&timer->ref_cnt, 1, + __ATOMIC_RELEASE); + if (cnt == 0) + free(timer); + } +} + +static void +test_timer_cb(struct rte_htimer *timer, void *arg __rte_unused) +{ + struct test_timer *test_timer = + container_of(timer, struct test_timer, htimer); + uint64_t now = rte_get_tsc_cycles(); + + if (now < test_timer->expiration_time) + *(test_timer->failure_occurred) = true; + + __atomic_fetch_add(test_timer->timeout_count, 1, __ATOMIC_RELAXED); + + test_timer_dec_ref_cnt(test_timer); +} + +static int +worker_lcore(void *arg) +{ + bool *stop = arg; + + while (!__atomic_load_n(stop, __ATOMIC_RELAXED)) + rte_htimer_mgr_manage(); + + return 0; +} + +struct cancel_timer { + bool cancel; + struct rte_htimer *target_timer; + uint32_t *cancel_count; + uint32_t *expired_count; + bool *failure_occurred; + struct rte_htimer htimer; +}; + +static struct cancel_timer * +cancel_timer_create(bool cancel, struct rte_htimer *target_timer, + uint32_t *cancel_count, uint32_t *expired_count, + bool *failure_occurred) +{ + struct cancel_timer *timer; + + timer = malloc(sizeof(struct cancel_timer)); + + if (timer == NULL) + rte_panic("Unable to allocate timer memory\n"); + + timer->cancel = cancel; + timer->target_timer = target_timer; + timer->cancel_count = cancel_count; + timer->expired_count = expired_count; + timer->failure_occurred = failure_occurred; + + return timer; +} + +static void +async_cancel_cb(struct rte_htimer *timer, int result, void *cb_arg) +{ + struct test_timer *test_timer = + container_of(timer, struct test_timer, htimer); + struct cancel_timer *cancel_timer = cb_arg; + bool *failure_occurred = cancel_timer->failure_occurred; + + if (!cancel_timer->cancel || cancel_timer->target_timer != timer) + *failure_occurred = true; + + if (result == RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED) { + uint32_t *cancel_count = 
cancel_timer->cancel_count; + + /* decrease target lcore's ref count */ + test_timer_dec_ref_cnt(test_timer); + (*cancel_count)++; + } else if (result == RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED) { + uint32_t *expired_count = cancel_timer->expired_count; + + (*expired_count)++; + } else + *failure_occurred = true; + + /* source lcore's ref count */ + test_timer_dec_ref_cnt(test_timer); + + free(cancel_timer); +} + +static void +cancel_timer_cb(struct rte_htimer *timer, void *arg __rte_unused) +{ + struct cancel_timer *cancel_timer = + container_of(timer, struct cancel_timer, htimer); + + if (cancel_timer->cancel) { + int rc; + + rc = rte_htimer_mgr_async_cancel(cancel_timer->target_timer, + async_cancel_cb, cancel_timer); + + if (rc == -EBUSY) + rte_htimer_mgr_add(timer, 0, 0, cancel_timer_cb, + NULL, 0); + } else + free(cancel_timer); +} + +#define REF_CNT_TEST_TICK 10 /* ns */ +#define REF_CNT_AVG_EXPIRATION_TIME (50 * 1000) /* ns */ +#define REF_CNT_MAX_EXPIRATION_TIME (2 * REF_CNT_AVG_EXPIRATION_TIME) +#define REF_CNT_CANCEL_FUZZ(expiration_time) \ + ((uint64_t)((expiration_time) * (rte_drand()/10 + 0.95))) + +static int +test_htimer_mgr_ref_cnt_timers(unsigned int num_timers_per_lcore) +{ + unsigned int lcore_id; + bool stop = false; + unsigned int num_workers = 0; + struct test_timer **timers; + struct cancel_timer **cancel_timers; + unsigned int num_timers; + uint32_t timeout_count = 0; + uint32_t cancel_count = 0; + uint32_t expired_count = 0; + bool failure_occurred = false; + unsigned int timer_idx; + unsigned int expected_cancel_attempts; + uint64_t deadline; + uint64_t now; + + rte_htimer_mgr_init(REF_CNT_TEST_TICK); + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + if (rte_eal_remote_launch(worker_lcore, &stop, lcore_id) != 0) + rte_panic("Unable to launch timer lcore\n"); + num_workers++; + } + + /* give the workers a chance to get going */ + rte_delay_us_block(10*1000); + + num_timers = num_timers_per_lcore * num_workers; + + timers = malloc(sizeof(struct test_timer *) * num_timers); + cancel_timers = malloc(sizeof(struct cancel_timer *) * num_timers); + + if (timers == NULL || cancel_timers == NULL) + rte_panic("Unable to allocate memory\n"); + + timer_idx = 0; + expected_cancel_attempts = 0; + + RTE_LCORE_FOREACH_WORKER(lcore_id) { + unsigned int i; + + for (i = 0; i < num_timers_per_lcore; i++) { + uint64_t expiration_time; + struct test_timer *timer; + struct rte_htimer *htimer; + bool cancel; + struct cancel_timer *cancel_timer; + uint64_t cancel_expiration_time; + + expiration_time = + REF_CNT_MAX_EXPIRATION_TIME * rte_drand(); + + timer = test_timer_create(expiration_time, + &timeout_count, + &failure_occurred); + htimer = &timer->htimer; + + timers[timer_idx++] = timer; + + /* for the target lcore's usage of this time */ + test_timer_inc_ref_cnt(timer); + + for (;;) { + int rc; + + rc = rte_htimer_mgr_async_add(htimer, lcore_id, + expiration_time, + 0, test_timer_cb, + NULL, 0, NULL, + NULL); + if (unlikely(rc == -EBUSY)) + rte_htimer_mgr_process(); + else + break; + } + + cancel = rte_rand_max(2); + + cancel_timer = + cancel_timer_create(cancel, &timer->htimer, + &cancel_count, + &expired_count, + &failure_occurred); + + cancel_expiration_time = + REF_CNT_CANCEL_FUZZ(expiration_time); + + rte_htimer_mgr_add(&cancel_timer->htimer, + cancel_expiration_time, 0, + cancel_timer_cb, NULL, 0); + + if (cancel) + expected_cancel_attempts++; + } + } + + deadline = rte_get_tsc_cycles() + REF_CNT_MAX_EXPIRATION_TIME + + s_to_tsc(0.25); + + do { + now = rte_get_tsc_cycles(); + + 
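                /*
                 * rte_htimer_mgr_manage_time() lets the caller supply the
                 * current time, rather than having the manager read the
                 * clock itself as rte_htimer_mgr_manage() does; here the
                 * timestamp is a raw TSC value, as signalled by
                 * RTE_HTIMER_FLAG_TIME_TSC.
                 */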
rte_htimer_mgr_manage_time(now, RTE_HTIMER_FLAG_TIME_TSC); + + } while (now < deadline); + + __atomic_store_n(&stop, true, __ATOMIC_RELAXED); + + rte_eal_mp_wait_lcore(); + + if (failure_occurred) + return TEST_FAILED; + + if ((cancel_count + expired_count) != expected_cancel_attempts) + return TEST_FAILED; + + if (timeout_count != (num_timers - cancel_count)) + return TEST_FAILED; + + rte_htimer_mgr_deinit(); + + return TEST_SUCCESS; +} + +static int +test_htimer_mgr(void) +{ + int rc; + + rc = test_htimer_mgr_async_add(1); + if (rc != TEST_SUCCESS) + return rc; + + rc = test_htimer_mgr_async_add(100000); + if (rc != TEST_SUCCESS) + return rc; + + rc = test_htimer_mgr_async_add_cancel(100); + if (rc != TEST_SUCCESS) + return rc; + + rc = test_htimer_mgr_ref_cnt_timers(10); + if (rc != TEST_SUCCESS) + return rc; + + rc = test_htimer_mgr_ref_cnt_timers(10000); + if (rc != TEST_SUCCESS) + return rc; + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(htimer_mgr_autotest, test_htimer_mgr); diff --git a/app/test/test_htimer_mgr_perf.c b/app/test/test_htimer_mgr_perf.c new file mode 100644 index 0000000000..cdc513228f --- /dev/null +++ b/app/test/test_htimer_mgr_perf.c @@ -0,0 +1,322 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +static void +nop_cb(struct rte_htimer *timer __rte_unused, void *cb_arg __rte_unused) +{ +} + +static uint64_t +add_rand_timers(struct rte_htimer *timers, uint64_t num, + uint64_t timeout_start, uint64_t max_timeout) +{ + uint64_t i; + uint64_t expiration_times[num]; + uint64_t start_ts; + uint64_t end_ts; + + for (i = 0; i < num; i++) + expiration_times[i] = + 1 + timeout_start + rte_rand_max(max_timeout - 1); + + start_ts = rte_get_tsc_cycles(); + + for (i = 0; i < num; i++) + rte_htimer_mgr_add(&timers[i], expiration_times[i], 0, nop_cb, + NULL, RTE_HTIMER_FLAG_ABSOLUTE_TIME); + + /* make sure the timers are actually scheduled in the wheel */ + rte_htimer_mgr_process(); + + end_ts = rte_get_tsc_cycles(); + + return end_ts - start_ts; +} + +#define TIME_STEP 16 + +static void +test_add_manage_perf(const char *scenario_name, uint64_t num_timers, + uint64_t timespan) +{ + uint64_t manage_calls; + struct rte_htimer *timers; + uint64_t start; + uint64_t now; + uint64_t start_ts; + uint64_t end_ts; + uint64_t add_latency; + uint64_t manage_latency; + + rte_htimer_mgr_init(1); + + manage_calls = timespan / TIME_STEP; + + printf("Scenario: %s\n", scenario_name); + printf(" Configuration:\n"); + printf(" Timers: %"PRIu64"\n", num_timers); + printf(" Max timeout: %"PRIu64" ticks\n", timespan); + printf(" Average timeouts/manage call: %.3f\n", + num_timers / (double)manage_calls); + printf(" Time advance per manage call: %d\n", TIME_STEP); + + printf(" Results:\n"); + + timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, 0); + + if (timers == NULL) + rte_panic("Unable to allocate memory\n"); + + start = 1 + rte_rand_max(UINT64_MAX / 2); + + rte_htimer_mgr_manage_time(start - 1, 0); + + add_latency = add_rand_timers(timers, num_timers, start, timespan); + + start_ts = rte_get_tsc_cycles(); + + for (now = start; now < (start + timespan); now += TIME_STEP) + rte_htimer_mgr_manage_time(now, 0); + + end_ts = rte_get_tsc_cycles(); + + manage_latency = end_ts - start_ts; + + printf(" %.0f TSC cycles / add op\n", + (double)add_latency / num_timers); + printf(" %.0f TSC cycles / manage call\n", + (double)manage_latency 
/ manage_calls); + printf(" %.1f TSC cycles / tick\n", + (double)manage_latency / timespan); + + rte_htimer_mgr_deinit(); + + rte_free(timers); +} + +static uint64_t +s_to_tsc(double s) +{ + return s * rte_get_tsc_hz(); +} + +static double +tsc_to_s(uint64_t tsc) +{ + return (double)tsc / (double)rte_get_tsc_hz(); +} + +#define ITERATIONS 500 + +static int +test_del_perf(uint64_t num_timers, uint64_t timespan) +{ + struct rte_htimer *timers; + uint64_t start; + uint64_t i, j; + uint64_t start_ts; + uint64_t end_ts; + uint64_t latency = 0; + + rte_htimer_mgr_init(1); + + timers = + rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, 0); + + if (timers == NULL) + rte_panic("Unable to allocate memory\n"); + + start = 1 + rte_rand_max(UINT64_MAX / 2); + + for (i = 0; i < ITERATIONS; i++) { + rte_htimer_mgr_manage_time(start - 1, 0); + + add_rand_timers(timers, num_timers, start, timespan); + + /* A manage (or process) call is required to get all + * timers scheduled, which may in turn make them a + * little more expensive to remove. + */ + rte_htimer_mgr_manage_time(start, 0); + + start_ts = rte_get_tsc_cycles(); + + for (j = 0; j < num_timers; j++) + if (rte_htimer_mgr_cancel(&timers[j]) < 0) + return TEST_FAILED; + + end_ts = rte_get_tsc_cycles(); + + latency += (end_ts - start_ts); + + start += (timespan + 1); + } + + printf("Timer delete: %.0f TSC cycles / call\n", + (double)latency / (double)ITERATIONS / (double)num_timers); + + rte_htimer_mgr_deinit(); + + rte_free(timers); + + return TEST_SUCCESS; +} + +static int +target_lcore(void *arg) +{ + bool *stop = arg; + + while (!__atomic_load_n(stop, __ATOMIC_RELAXED)) + rte_htimer_mgr_manage(); + + return 0; +} + +static void +count_async_cb(struct rte_htimer *timer __rte_unused, int result, + void *cb_arg) +{ + unsigned int *count = cb_arg; + + if (result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) + (*count)++; +} + +#define ASYNC_ADD_TEST_TICK s_to_tsc(500e-9) +/* + * The number of test timers must be kept less than size of the + * htimer-internal message ring for this test case to work. 
+ */ +#define ASYNC_ADD_TEST_NUM_TIMERS 1000 +#define ASYNC_ADD_TEST_MIN_TIMEOUT (ASYNC_ADD_TEST_NUM_TIMERS * s_to_tsc(1e-6)) +#define ASYNC_ADD_TEST_MAX_TIMEOUT (2 * ASYNC_ADD_TEST_MIN_TIMEOUT) + +static void +test_async_add_perf(void) +{ + uint64_t max_timeout = ASYNC_ADD_TEST_MAX_TIMEOUT; + uint64_t min_timeout = ASYNC_ADD_TEST_MIN_TIMEOUT; + unsigned int num_timers = ASYNC_ADD_TEST_NUM_TIMERS; + struct rte_htimer *timers; + bool *stop; + unsigned int lcore_id = rte_lcore_id(); + unsigned int target_lcore_id = + rte_get_next_lcore(lcore_id, true, true); + uint64_t now; + uint64_t request_latency = 0; + uint64_t response_latency = 0; + unsigned int i; + + rte_htimer_mgr_init(ASYNC_ADD_TEST_TICK); + + timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, + RTE_CACHE_LINE_SIZE); + stop = rte_malloc(NULL, sizeof(bool), RTE_CACHE_LINE_SIZE); + + if (timers == NULL || stop == NULL) + rte_panic("Unable to allocate memory\n"); + + *stop = false; + + if (rte_eal_remote_launch(target_lcore, stop, target_lcore_id) != 0) + rte_panic("Unable to launch worker lcore\n"); + + /* wait for launch to complete */ + rte_delay_us_block(100); + + for (i = 0; i < ITERATIONS; i++) { + uint64_t expiration_times[num_timers]; + unsigned int j; + uint64_t start_ts; + uint64_t end_ts; + unsigned int count = 0; + + now = rte_get_tsc_cycles(); + + for (j = 0; j < num_timers; j++) + expiration_times[j] = now + min_timeout + + rte_rand_max(max_timeout - min_timeout); + + start_ts = rte_get_tsc_cycles(); + + for (j = 0; j < num_timers; j++) + rte_htimer_mgr_async_add(&timers[j], target_lcore_id, + expiration_times[j], 0, + nop_cb, NULL, + RTE_HTIMER_FLAG_ABSOLUTE_TIME, + count_async_cb, &count); + + end_ts = rte_get_tsc_cycles(); + + request_latency += (end_ts - start_ts); + + /* wait long-enough for the target lcore to answered */ + rte_delay_us_block(1 * num_timers); + + start_ts = rte_get_tsc_cycles(); + + while (count != num_timers) + rte_htimer_mgr_process(); + + end_ts = rte_get_tsc_cycles(); + + response_latency += (end_ts - start_ts); + + /* wait until all timeouts have fired */ + rte_delay_us_block(tsc_to_s(max_timeout) * 1e6); + } + + __atomic_store_n(stop, true, __ATOMIC_RELAXED); + + rte_eal_mp_wait_lcore(); + + rte_free(timers); + + rte_htimer_mgr_deinit(); + + printf("Timer async add:\n"); + printf(" Configuration:\n"); + printf(" Timers: %d\n", ASYNC_ADD_TEST_NUM_TIMERS); + printf(" Results:\n"); + printf(" Source lcore cost: %.0f TSC cycles / add request\n", + (double)request_latency / (double)ITERATIONS / num_timers); + printf(" %.0f TSC cycles / add response\n", + (double)response_latency / (double)ITERATIONS / num_timers); +} + +static int +test_htimer_mgr_perf(void) +{ + /* warm up */ + rte_delay_us_block(10000); + + test_add_manage_perf("Sparse", 100000, 10000000); + + test_add_manage_perf("Dense", 100000, 200000); + + test_add_manage_perf("Idle", 10, 100000); + + if (test_del_perf(100000, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + test_async_add_perf(); + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(htimer_mgr_perf_autotest, test_htimer_mgr_perf); diff --git a/app/test/test_htw.c b/app/test/test_htw.c new file mode 100644 index 0000000000..3cddfaed7f --- /dev/null +++ b/app/test/test_htw.c @@ -0,0 +1,478 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include + +struct recorder { + struct rte_htimer_list timeout_list; + uint64_t num_timeouts; +}; + +static void 
+recorder_init(struct recorder *recorder) +{ + recorder->num_timeouts = 0; + LIST_INIT(&recorder->timeout_list); +} + +static void +recorder_cb(struct rte_htimer *timer, void *arg) +{ + struct recorder *recorder = arg; + + recorder->num_timeouts++; + + LIST_INSERT_HEAD(&recorder->timeout_list, timer, entry); +} + +static int +recorder_verify(struct recorder *recorder, uint64_t min_expiry, + uint64_t max_expiry) +{ + struct rte_htimer *timer; + + LIST_FOREACH(timer, &recorder->timeout_list, entry) { + if (timer->expiration_time > max_expiry) + return TEST_FAILED; + + if (timer->expiration_time < min_expiry) + return TEST_FAILED; + } + + return TEST_SUCCESS; +} + +static void +add_rand_timers(struct rte_htw *htw, struct rte_htimer *timers, + uint64_t num, uint64_t timeout_start, uint64_t max_timeout, + rte_htimer_cb_t cb, void *cb_arg) +{ + uint64_t i; + + for (i = 0; i < num; i++) { + struct rte_htimer *timer = &timers[i]; + bool use_absolute = rte_rand() & 1; + unsigned int flags = 0; + uint64_t expiration_time; + + expiration_time = timeout_start + rte_rand_max(max_timeout); + + if (use_absolute) + flags |= RTE_HTIMER_FLAG_ABSOLUTE_TIME; + else { + uint64_t htw_current_time; + + htw_current_time = rte_htw_current_time(htw); + + if (expiration_time < htw_current_time) + expiration_time = 0; + else + expiration_time -= htw_current_time; + } + + rte_htw_add(htw, timer, expiration_time, 0, cb, cb_arg, flags); + } +} + +#define ADVANCE_TIME_MAX_STEP 16 + +static int +test_rand_timers(uint64_t in_flight_timers, uint64_t max_timeout, + uint64_t runtime) +{ + struct recorder recorder; + struct rte_htimer *timers; + uint64_t fired = 0; + uint64_t start; + uint64_t now; + struct rte_htw *htw; + uint64_t added; + + recorder_init(&recorder); + + timers = malloc(sizeof(struct rte_htimer) * in_flight_timers); + + if (timers == NULL) + rte_panic("Unable to allocate heap memory\n"); + + start = rte_rand_max(UINT64_MAX - max_timeout); + + htw = rte_htw_create(); + + if (htw == NULL) + return TEST_FAILED; + + added = in_flight_timers; + add_rand_timers(htw, timers, added, start + 1, max_timeout, + recorder_cb, &recorder); + + for (now = start; now < (start + runtime); ) { + uint64_t advance; + + advance = rte_rand_max(ADVANCE_TIME_MAX_STEP); + + now += advance; + + rte_htw_manage(htw, now); + + if (recorder.num_timeouts > 0) { + struct rte_htimer *timer; + + if (advance == 0) + return TEST_FAILED; + + if (recorder_verify(&recorder, now - advance + 1, now) + != TEST_SUCCESS) + return TEST_FAILED; + + while ((timer = LIST_FIRST(&recorder.timeout_list)) + != NULL) { + LIST_REMOVE(timer, entry); + + add_rand_timers(htw, timer, 1, + now + 1, max_timeout, + recorder_cb, &recorder); + added++; + fired++; + } + + recorder.num_timeouts = 0; + } + } + + /* finish the remaining timeouts */ + + rte_htw_manage(htw, now + max_timeout); + + if (recorder_verify(&recorder, now, now + max_timeout) != TEST_SUCCESS) + return TEST_FAILED; + fired += recorder.num_timeouts; + + if (fired != added) + return TEST_FAILED; + + rte_htw_destroy(htw); + + free(timers); + + return TEST_SUCCESS; +} + +struct counter_state { + int calls; + struct rte_htw *htw; + bool cancel; +}; + +static void +count_timeouts_cb(struct rte_htimer *timer __rte_unused, void *arg) +{ + struct counter_state *state = arg; + + state->calls++; + + if (state->cancel) + rte_htw_cancel(state->htw, timer); +} + +static int +test_single_timeout_type(uint64_t now, uint64_t distance, bool use_absolute) +{ + struct rte_htw *htw; + struct counter_state cstate = {}; + 
struct rte_htimer timer; + uint64_t expiration_time; + unsigned int flags = 0; + + htw = rte_htw_create(); + + rte_htw_manage(htw, now); + + if (use_absolute) { + expiration_time = now + distance; + flags |= RTE_HTIMER_FLAG_ABSOLUTE_TIME; + } else + expiration_time = distance; + + rte_htw_add(htw, &timer, expiration_time, 0, count_timeouts_cb, + &cstate, flags); + + rte_htw_manage(htw, now); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance - 1); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance); + + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + distance + 1); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +static int +test_single_timeout(uint64_t now, uint64_t distance) +{ + + int rc; + + rc = test_single_timeout_type(now, distance, true); + if (rc < 0) + return rc; + + rc = test_single_timeout_type(now, distance, false); + if (rc < 0) + return rc; + + return TEST_SUCCESS; +} + +static int +test_periodical_timer(uint64_t now, uint64_t start, uint64_t period) +{ + struct rte_htw *htw; + struct counter_state cstate; + struct rte_htimer timer; + + htw = rte_htw_create(); + + cstate = (struct counter_state) { + .htw = htw + }; + + rte_htw_manage(htw, now); + + rte_htw_add(htw, &timer, start, period, count_timeouts_cb, + &cstate, RTE_HTIMER_FLAG_PERIODICAL); + + rte_htw_manage(htw, now); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + start - 1); + + if (cstate.calls != 0) + return TEST_FAILED; + + rte_htw_manage(htw, now + start); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + start + 1); + + if (cstate.calls != 1) + return TEST_FAILED; + + rte_htw_manage(htw, now + start + period); + + if (cstate.calls != 2) + return TEST_FAILED; + + cstate.cancel = true; + + rte_htw_manage(htw, now + start + 2 * period); + + if (cstate.calls != 3) + return TEST_FAILED; + + rte_htw_manage(htw, now + start + 3 * period); + + if (cstate.calls != 3) + return TEST_FAILED; + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +#define CANCEL_ITERATIONS 1000 +#define CANCEL_NUM_TIMERS 1000 +#define CANCEL_MAX_DISTANCE 10000 + +static int +test_cancel_timer(void) +{ + uint64_t now; + struct rte_htw *htw; + int i; + struct rte_htimer timers[CANCEL_NUM_TIMERS]; + struct counter_state timeouts[CANCEL_NUM_TIMERS]; + + now = rte_rand_max(UINT64_MAX / 2); + + htw = rte_htw_create(); + + for (i = 0; i < CANCEL_ITERATIONS; i++) { + int j; + int target; + + for (j = 0; j < CANCEL_NUM_TIMERS; j++) { + struct rte_htimer *timer = &timers[j]; + uint64_t expiration_time; + + timeouts[j] = (struct counter_state) {}; + + expiration_time = now + 1 + + rte_rand_max(CANCEL_MAX_DISTANCE); + + rte_htw_add(htw, timer, expiration_time, 0, + count_timeouts_cb, &timeouts[j], + RTE_HTIMER_FLAG_ABSOLUTE_TIME); + } + + target = rte_rand_max(CANCEL_NUM_TIMERS); + + rte_htw_cancel(htw, &timers[target]); + + now += CANCEL_MAX_DISTANCE; + + rte_htw_manage(htw, now); + + for (j = 0; j < CANCEL_NUM_TIMERS; j++) { + if (j != target) { + if (timeouts[j].calls != 1) + return TEST_FAILED; + } else { + if (timeouts[j].calls > 0) + return TEST_FAILED; + } + } + } + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +static void +nop_cb(struct rte_htimer *timer __rte_unused, void *arg __rte_unused) +{ +} + +#define NEXT_NUM_TIMERS 1000 
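/*
 * A rough usage sketch of the lower-level rte_htw (hierarchical timer
 * wheel) API exercised in this file. Per the commit message, the wheel
 * is the per-lcore building block underneath rte_htimer_mgr; it is
 * driven by an explicit, caller-supplied time in ticks. Calls and
 * argument order follow the tests above; the tick values are made up.
 */
static void
htw_usage_sketch(void)
{
        struct rte_htw *htw;
        struct rte_htimer timer;

        htw = rte_htw_create();

        /* advance the wheel to (an arbitrary) tick 100 */
        rte_htw_manage(htw, 100);

        /* one-shot timer expiring at tick 150, given as an absolute time */
        rte_htw_add(htw, &timer, 150, 0, nop_cb, NULL,
                    RTE_HTIMER_FLAG_ABSOLUTE_TIME);

        /* runs nop_cb() when the wheel reaches the expiration tick */
        rte_htw_manage(htw, 150);

        rte_htw_destroy(htw);
}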
+#define NEXT_MAX_DISTANCE 10000 + +static int +test_next_timeout(void) +{ + uint64_t now; + struct rte_htw *htw; + int i; + struct rte_htimer timers[NEXT_NUM_TIMERS]; + uint64_t last_expiration; + + now = rte_rand_max(NEXT_MAX_DISTANCE); + + htw = rte_htw_create(); + + if (rte_htw_next_timeout(htw, UINT64_MAX) != UINT64_MAX) + return TEST_FAILED; + if (rte_htw_next_timeout(htw, now + 1) != (now + 1)) + return TEST_FAILED; + + rte_htw_manage(htw, now); + + last_expiration = now + NEXT_MAX_DISTANCE * NEXT_NUM_TIMERS; + + for (i = 0; i < NEXT_NUM_TIMERS; i++) { + struct rte_htimer *timer = &timers[i]; + uint64_t expiration; + uint64_t upper_bound; + + /* add timers, each new one closer than the last */ + + expiration = last_expiration - rte_rand_max(NEXT_MAX_DISTANCE); + + rte_htw_add(htw, timer, expiration, 0, nop_cb, NULL, + RTE_HTIMER_FLAG_ABSOLUTE_TIME); + + if (rte_htw_next_timeout(htw, UINT64_MAX) != expiration) + return TEST_FAILED; + + upper_bound = expiration + rte_rand_max(100000); + + if (rte_htw_next_timeout(htw, upper_bound) != expiration) + return TEST_FAILED; + + upper_bound = expiration - rte_rand_max(expiration); + + if (rte_htw_next_timeout(htw, upper_bound) != upper_bound) + return TEST_FAILED; + + last_expiration = expiration; + } + + rte_htw_destroy(htw); + + return TEST_SUCCESS; +} + +static int +test_htw(void) +{ + if (test_single_timeout(0, 10) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(0, 254) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(0, 255) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(255, 1) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_single_timeout(254, 2) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_periodical_timer(10000, 500, 2) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_periodical_timer(1234567, 12345, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_cancel_timer() != TEST_SUCCESS) + return TEST_FAILED; + + if (test_rand_timers(1000, 100000, 100000000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_rand_timers(100000, 100000, 1000000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_next_timeout() != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(htw_autotest, test_htw); diff --git a/app/test/test_htw_perf.c b/app/test/test_htw_perf.c new file mode 100644 index 0000000000..65901f0874 --- /dev/null +++ b/app/test/test_htw_perf.c @@ -0,0 +1,181 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include +#include + +static void +nop_cb(struct rte_htimer *timer __rte_unused, void *arg __rte_unused) +{ +} + +static void +add_rand_timers(struct rte_htw *htw, struct rte_htimer *timers, + uint64_t num, uint64_t timeout_start, uint64_t max_timeout) +{ + uint64_t i; + uint64_t expiration_times[num]; + uint64_t start_ts; + uint64_t end_ts; + + for (i = 0; i < num; i++) + expiration_times[i] = timeout_start + rte_rand_max(max_timeout); + + start_ts = rte_get_tsc_cycles(); + + for (i = 0; i < num; i++) { + struct rte_htimer *timer = &timers[i]; + + rte_htw_add(htw, timer, expiration_times[i], 0, nop_cb, NULL, + RTE_HTIMER_FLAG_ABSOLUTE_TIME); + } + + /* actually install the timers */ + rte_htw_process(htw); + + end_ts = rte_get_tsc_cycles(); + + printf(" %.0f TSC cycles / add op\n", + (double)(end_ts - start_ts) / num); +} + +#define TIME_STEP 16 + +static int +test_add_manage_perf(const char *scenario_name, 
uint64_t num_timers, + uint64_t timespan) +{ + uint64_t manage_calls; + struct rte_htimer *timers; + uint64_t start; + uint64_t now; + struct rte_htw *htw; + uint64_t start_ts; + uint64_t end_ts; + double latency; + + manage_calls = timespan / TIME_STEP; + + printf("Scenario: %s\n", scenario_name); + printf(" Configuration:\n"); + printf(" Timers: %"PRIu64"\n", num_timers); + printf(" Max timeout: %"PRIu64" ticks\n", timespan); + printf(" Average timeouts/manage call: %.3f\n", + num_timers / (double)manage_calls); + printf(" Time advance per manage call: %d\n", TIME_STEP); + + printf(" Results:\n"); + + timers = rte_malloc(NULL, sizeof(struct rte_htimer) * + num_timers, 0); + + if (timers == NULL) + rte_panic("Unable to allocate memory\n"); + + htw = rte_htw_create(); + + if (htw == NULL) + return TEST_FAILED; + + start = 1 + rte_rand_max(UINT64_MAX / 2); + + rte_htw_manage(htw, start - 1); + + add_rand_timers(htw, timers, num_timers, start, timespan); + + start_ts = rte_get_tsc_cycles(); + + for (now = start; now < (start + timespan); now += TIME_STEP) + rte_htw_manage(htw, now); + + end_ts = rte_get_tsc_cycles(); + + latency = end_ts - start_ts; + + printf(" %.0f TSC cycles / manage call\n", + latency / manage_calls); + printf(" %.1f TSC cycles / tick\n", latency / timespan); + + rte_htw_destroy(htw); + + rte_free(timers); + + return TEST_SUCCESS; +} + +static int +test_cancel_perf(uint64_t num_timers, uint64_t timespan) +{ + struct rte_htimer *timers; + uint64_t start; + struct rte_htw *htw; + uint64_t i; + uint64_t start_ts; + uint64_t end_ts; + double latency; + + timers = rte_malloc(NULL, sizeof(struct rte_htimer) * num_timers, 0); + + if (timers == NULL) + rte_panic("Unable to allocate memory\n"); + + htw = rte_htw_create(); + + if (htw == NULL) + return TEST_FAILED; + + start = 1 + rte_rand_max(UINT64_MAX / 2); + + rte_htw_manage(htw, start - 1); + + add_rand_timers(htw, timers, num_timers, start, timespan); + + start_ts = rte_get_tsc_cycles(); + + for (i = 0; i < num_timers; i++) + rte_htw_cancel(htw, &timers[i]); + + end_ts = rte_get_tsc_cycles(); + + latency = end_ts - start_ts; + + printf("Timer delete: %.0f TSC cycles / call\n", + latency / num_timers); + + rte_htw_destroy(htw); + + rte_free(timers); + + return TEST_SUCCESS; +} + +static int +test_htw_perf(void) +{ + rte_delay_us_block(100); + + if (test_add_manage_perf("Sparse", 100000, 10000000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_add_manage_perf("Dense", 100000, 200000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_add_manage_perf("Idle", 10, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + if (test_cancel_perf(100000, 100000) != TEST_SUCCESS) + return TEST_FAILED; + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(htw_perf_autotest, test_htw_perf); diff --git a/app/test/test_timer_htimer_htw_perf.c b/app/test/test_timer_htimer_htw_perf.c new file mode 100644 index 0000000000..e51fc7282f --- /dev/null +++ b/app/test/test_timer_htimer_htw_perf.c @@ -0,0 +1,693 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include "test.h" + +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +static uint64_t +s_to_tsc(double s) +{ + return s * rte_get_tsc_hz(); +} + +static double +tsc_to_s(uint64_t tsc) +{ + return (double)tsc / (double)rte_get_tsc_hz(); +} + +struct timer_conf { + uint64_t start; + uint64_t interval; +}; + +static void +get_timer_confs(double aggregate_expiration_rate, + struct timer_conf *timer_confs, + 
size_t num_timers) +{ + double avg_expiration_rate; + size_t i; + + avg_expiration_rate = aggregate_expiration_rate / num_timers; + + for (i = 0; i < num_timers; i++) { + struct timer_conf *conf = &timer_confs[i]; + double expiration_rate; + + expiration_rate = avg_expiration_rate * (rte_drand() + 0.5); + + conf->interval = rte_get_tsc_hz() / expiration_rate; + conf->start = rte_rand_max(conf->interval); + } +} + +struct timer_lib_ops { + const char *name; + + void *(*create)(const struct timer_conf *timer_confs, + size_t num_timers, bool cancel, uint64_t *fired); + void (*manage_time)(void *data, uint64_t current_time); + void (*manage)(void *data); + void (*destroy)(void *data); +}; + +static void * +nop_create(const struct timer_conf *timer_confs __rte_unused, + size_t num_timers __rte_unused, bool cancel __rte_unused, + uint64_t *fired __rte_unused) +{ + return NULL; +} + +static __rte_noinline void +nop_manage(void *data __rte_unused) +{ +} + +static __rte_noinline void +nop_manage_time(void *data __rte_unused, uint64_t current_time __rte_unused) +{ +} + +static void +nop_destroy(void *data __rte_unused) +{ +} + +static struct timer_lib_ops nop_ops = { + .name = "nop", + .create = nop_create, + .manage = nop_manage, + .manage_time = nop_manage_time, + .destroy = nop_destroy +}; + +struct ctimer { + uint64_t interval; + struct rte_timer timer; + uint64_t cancel_offset; + struct rte_timer canceled_timer; +}; + +static void +crash_cb(struct rte_timer *timer __rte_unused, void *cb_arg __rte_unused) +{ + abort(); +} + +#define CANCELED_OFFSET (0.5) /* s */ + +static void +test_cb(struct rte_timer *timer, void *cb_arg) +{ + struct ctimer *ctimer = + container_of(timer, struct ctimer, timer); + uint64_t *fired = cb_arg; + + rte_timer_reset(timer, ctimer->interval, SINGLE, + rte_lcore_id(), test_cb, cb_arg); + + if (ctimer->cancel_offset > 0) + rte_timer_reset(&ctimer->canceled_timer, + ctimer->interval + ctimer->cancel_offset, + SINGLE, rte_lcore_id(), crash_cb, NULL); + + (*fired)++; +} + +static void * +timer_create1(const struct timer_conf *timer_confs, size_t num_timers, + bool cancel, uint64_t *fired) +{ + struct ctimer *ctimers; + unsigned int i; + + ctimers = rte_malloc(NULL, sizeof(struct ctimer) * num_timers, 0); + + if (num_timers > 0 && ctimers == NULL) + rte_panic("Unable to allocate memory\n"); + + rte_timer_subsystem_init(); + + for (i = 0; i < num_timers; i++) { + const struct timer_conf *timer_conf = &timer_confs[i]; + struct ctimer *ctimer = &ctimers[i]; + struct rte_timer *timer = &ctimer->timer; + + rte_timer_init(timer); + + ctimer->interval = timer_conf->interval; + + rte_timer_reset(timer, timer_conf->start, SINGLE, + rte_lcore_id(), test_cb, fired); + + if (cancel) { + ctimer->cancel_offset = s_to_tsc(CANCELED_OFFSET); + + rte_timer_reset(&ctimer->canceled_timer, + timer_conf->start + ctimer->cancel_offset, + SINGLE, rte_lcore_id(), + crash_cb, NULL); + } else + ctimer->cancel_offset = 0; + } + + return ctimers; +} + +static void +timer_manage(void *data __rte_unused) +{ + rte_timer_manage(); +} + +static void +timer_manage_time(void *data __rte_unused, uint64_t current_time __rte_unused) +{ + rte_timer_manage(); +} + +static void +timer_destroy(void *data) +{ + rte_free(data); + + rte_timer_subsystem_finalize(); +} + +static struct timer_lib_ops timer_ops = { + .name = "timer", + .create = timer_create1, + .manage = timer_manage, + .manage_time = timer_manage_time, + .destroy = timer_destroy +}; + +struct chtimer { + uint64_t interval; + struct rte_htimer htimer; + 
uint64_t cancel_offset; + struct rte_htimer canceled_htimer; +}; + +static void +hcrash_cb(struct rte_htimer *timer __rte_unused, void *cb_arg __rte_unused) +{ + abort(); +} + +static void +htest_cb(struct rte_htimer *timer, void *cb_arg) +{ + struct chtimer *chtimer = + container_of(timer, struct chtimer, htimer); + uint64_t *fired = cb_arg; + + rte_htimer_mgr_add(timer, chtimer->interval, 0, htest_cb, cb_arg, + RTE_HTIMER_FLAG_TIME_TSC); + + if (chtimer->cancel_offset > 0) { + struct rte_htimer *canceled_htimer = + &chtimer->canceled_htimer; + uint64_t cancel_expiration_time = chtimer->interval + + chtimer->cancel_offset; + + rte_htimer_mgr_cancel(canceled_htimer); + + rte_htimer_mgr_add(canceled_htimer, cancel_expiration_time, 0, + hcrash_cb, NULL, RTE_HTIMER_FLAG_TIME_TSC); + } + + (*fired)++; +} + +#define TICK_LENGTH (1e-6) + +static void * +htimer_create(const struct timer_conf *timer_confs, size_t num_timers, + bool cancel, uint64_t *fired) +{ + struct chtimer *chtimers; + unsigned int i; + + chtimers = rte_malloc(NULL, sizeof(struct chtimer) * num_timers, 0); + + if (num_timers > 0 && chtimers == NULL) + rte_panic("Unable to allocate memory\n"); + + rte_htimer_mgr_init(TICK_LENGTH * NS_PER_S); + + rte_htimer_mgr_manage(); + + for (i = 0; i < num_timers; i++) { + const struct timer_conf *timer_conf = &timer_confs[i]; + struct chtimer *chtimer = &chtimers[i]; + + chtimer->interval = timer_conf->interval; + + rte_htimer_mgr_add(&chtimer->htimer, timer_conf->start, 0, + htest_cb, fired, RTE_HTIMER_FLAG_TIME_TSC); + + if (cancel) { + uint64_t cancel_start; + + chtimer->cancel_offset = s_to_tsc(CANCELED_OFFSET); + + cancel_start = + timer_conf->start + chtimer->cancel_offset; + + rte_htimer_mgr_add(&chtimer->canceled_htimer, + cancel_start, 0, + hcrash_cb, NULL, + RTE_HTIMER_FLAG_TIME_TSC); + } else + chtimer->cancel_offset = 0; + } + + rte_htimer_mgr_process(); + + return chtimers; +} + +static void +htimer_manage(void *data __rte_unused) +{ + rte_htimer_mgr_manage(); +} + +static void +htimer_manage_time(void *data __rte_unused, uint64_t current_time) +{ + rte_htimer_mgr_manage_time(current_time, RTE_HTIMER_FLAG_TIME_TSC); +} + +static void +htimer_destroy(void *data) +{ + rte_free(data); + + rte_htimer_mgr_deinit(); +} + +static struct timer_lib_ops htimer_ops = { + .name = "htimer", + .create = htimer_create, + .manage = htimer_manage, + .manage_time = htimer_manage_time, + .destroy = htimer_destroy, +}; + +struct htw { + struct rte_htw *htw; + struct chtimer *chtimers; + uint64_t tsc_per_tick; + uint64_t *fired; +}; + +static void +htw_manage_time(void *timer_data, uint64_t current_time) +{ + struct htw *htw = timer_data; + uint64_t tick; + + tick = current_time / htw->tsc_per_tick; + + rte_htw_manage(htw->htw, tick); +} + +static void +htw_manage(void *timer_data) +{ + uint64_t now; + + now = rte_get_tsc_cycles(); + + htw_manage_time(timer_data, now); +} + +static void +htwcrash_cb(struct rte_htimer *timer __rte_unused, void *cb_arg __rte_unused) +{ + abort(); +} + +static void +htwtest_cb(struct rte_htimer *timer, void *cb_arg) +{ + struct chtimer *chtimer = + container_of(timer, struct chtimer, htimer); + struct htw *htw = cb_arg; + + rte_htw_add(htw->htw, timer, chtimer->interval, 0, htwtest_cb, + cb_arg, 0); + + if (chtimer->cancel_offset > 0) { + struct rte_htimer *canceled_htimer = + &chtimer->canceled_htimer; + uint64_t cancel_expiration_time = chtimer->interval + + chtimer->cancel_offset; + + rte_htw_cancel(htw->htw, canceled_htimer); + + rte_htw_add(htw->htw, 
canceled_htimer, + cancel_expiration_time, 0, + htwcrash_cb, cb_arg, 0); + } + + (*htw->fired)++; +} + +static void * +htw_create(const struct timer_conf *timer_confs, size_t num_timers, + bool cancel, uint64_t *fired) +{ + unsigned int i; + struct htw *htw; + + htw = rte_malloc(NULL, sizeof(struct htw), 0); + if (htw == NULL) + rte_panic("Unable to allocate memory\n"); + + htw->htw = rte_htw_create(); + if (htw == NULL) + rte_panic("Unable to create HTW\n"); + + htw->chtimers = + rte_malloc(NULL, sizeof(struct chtimer) * num_timers, 0); + if (num_timers > 0 && htw->chtimers == NULL) + rte_panic("Unable to allocate memory\n"); + + htw->tsc_per_tick = s_to_tsc(TICK_LENGTH); + + htw->fired = fired; + + htw_manage(htw); + + for (i = 0; i < num_timers; i++) { + const struct timer_conf *timer_conf = &timer_confs[i]; + struct chtimer *chtimer = &htw->chtimers[i]; + uint64_t start; + + chtimer->interval = timer_conf->interval / htw->tsc_per_tick; + + start = timer_conf->start / htw->tsc_per_tick; + + rte_htw_add(htw->htw, &chtimer->htimer, + start, 0, htwtest_cb, htw, 0); + + if (cancel) { + uint64_t cancel_start; + + chtimer->cancel_offset = + s_to_tsc(CANCELED_OFFSET) / htw->tsc_per_tick; + + cancel_start = start + chtimer->cancel_offset; + + rte_htw_add(htw->htw, &chtimer->canceled_htimer, + cancel_start, 0, htwcrash_cb, NULL, 0); + } else + chtimer->cancel_offset = 0; + } + + rte_htw_process(htw->htw); + + return htw; +} + +static void +htw_destroy(void *data) +{ + struct htw *htw = data; + + rte_htw_destroy(htw->htw); + + rte_free(htw->chtimers); + + rte_free(htw); +} + +static struct timer_lib_ops htw_ops = { + .name = "htw", + .create = htw_create, + .manage = htw_manage, + .manage_time = htw_manage_time, + .destroy = htw_destroy, +}; + +static const struct timer_lib_ops *lib_ops[] = { + &timer_ops, &htimer_ops, &htw_ops +}; + +#define DUMMY_TASK_SIZE (2500) + +static __rte_noinline uint64_t +do_dummy_task(void) +{ + uint64_t result = 0; + unsigned int i; + + for (i = 0; i < DUMMY_TASK_SIZE; i++) + result += rte_rand(); + + return result; +} + +struct work_log { + uint64_t tasks_completed; + uint64_t runtime; +}; + +#define TARGET_RUNTIME (4.0) /* s */ + +struct run_result { + uint64_t tasks_completed; + uint64_t timer_fired; + uint64_t latency; +}; + +static void +run_with_lib(const struct timer_lib_ops *timer_ops, + const struct timer_conf *timer_confs, size_t num_timers, + bool cancel, struct run_result *result) +{ + void *timer_data; + uint64_t deadline; + uint64_t start; + uint64_t now; + volatile uint64_t sum = 0; + + result->tasks_completed = 0; + result->timer_fired = 0; + + timer_data = timer_ops->create(timer_confs, num_timers, cancel, + &result->timer_fired); + + start = rte_get_tsc_cycles(); + + deadline = start + s_to_tsc(TARGET_RUNTIME); + + do { + sum += do_dummy_task(); + + result->tasks_completed++; + + now = rte_get_tsc_cycles(); + + timer_ops->manage_time(timer_data, now); + } while (now < deadline); + + RTE_VERIFY(sum != 0); + + result->latency = rte_get_tsc_cycles() - start; + + timer_ops->destroy(timer_data); +} + +static void +benchmark_timer_libs(double aggregate_expiration_rate, uint64_t num_timers, + bool cancel) +{ + struct timer_conf timer_confs[num_timers]; + struct run_result nop_result; + double nop_per_task_latency; + struct run_result lib_results[RTE_DIM(lib_ops)]; + uint64_t lib_overhead[RTE_DIM(lib_ops)]; + + unsigned int i; + + printf("Configuration:\n"); + printf(" Aggregate timer expiration rate: %.3e Hz\n", + aggregate_expiration_rate); + if (cancel) 
+ printf(" Aggregate timer cancellation rate: %.3e Hz\n", + aggregate_expiration_rate); + printf(" Concurrent timers: %zd\n", num_timers); + printf(" Tick length: %.1e s\n", TICK_LENGTH); + + rte_srand(4711); + + get_timer_confs(aggregate_expiration_rate, timer_confs, num_timers); + + run_with_lib(&nop_ops, NULL, 0, false, &nop_result); + nop_per_task_latency = + (double)nop_result.latency / nop_result.tasks_completed; + + for (i = 0; i < RTE_DIM(lib_ops); i++) { + struct run_result *lib_result = &lib_results[i]; + double per_task_latency; + + run_with_lib(lib_ops[i], timer_confs, num_timers, cancel, + lib_result); + + per_task_latency = (double)lib_result->latency / + lib_result->tasks_completed; + + if (per_task_latency > nop_per_task_latency) + lib_overhead[i] = + (per_task_latency - nop_per_task_latency) * + lib_result->tasks_completed; + else + lib_overhead[i] = 0; + } + + printf("Results:\n"); + + printf(" Work between manage calls: %.0f TSC cycles\n", + (double)nop_result.latency / nop_result.tasks_completed); + + printf("\n"); + printf("%-24s", ""); + for (i = 0; i < RTE_DIM(lib_ops); i++) + printf("%12s", lib_ops[i]->name); + printf("\n"); + + printf("%-24s", " Runtime [s]"); + for (i = 0; i < RTE_DIM(lib_ops); i++) + printf("%12.3e", tsc_to_s(lib_results[i].latency)); + printf("\n"); + + printf("%-24s", " Expiration rate [Hz]"); + for (i = 0; i < RTE_DIM(lib_ops); i++) + printf("%12.3e", lib_results[i].timer_fired / + tsc_to_s(lib_results[i].latency)); + printf("\n"); + + printf("%-24s", " Overhead [%%]"); + for (i = 0; i < RTE_DIM(lib_ops); i++) + printf("%12.3f", 100 * (double)lib_overhead[i] / + (double)lib_results[i].latency); + printf("\n"); + + printf("%-24s", " Per expiration [TSC]"); + for (i = 0; i < RTE_DIM(lib_ops); i++) + printf("%12"PRIu64, lib_overhead[i] / + lib_results[i].timer_fired); + printf("\n"); + + printf("%-24s", " Per manage() [TSC]"); + for (i = 0; i < RTE_DIM(lib_ops); i++) + printf("%12"PRIu64, lib_overhead[i] / + lib_results[i].tasks_completed); + printf("\n"); +} + +static void +benchmark_timer_libs_mode(double aggregate_expiration_rate, bool cancel) +{ + benchmark_timer_libs(aggregate_expiration_rate, 100, cancel); + benchmark_timer_libs(aggregate_expiration_rate, 100000, cancel); +} + +static void +benchmark_timer_libs_rate(double aggregate_expiration_rate) +{ + benchmark_timer_libs_mode(aggregate_expiration_rate, false); + benchmark_timer_libs_mode(aggregate_expiration_rate, true); +} + +#define MANAGE_ITERATIONS (10000000) + +static uint64_t +run_manage(const struct timer_lib_ops *timer_ops, bool user_provided_time) +{ + uint64_t start; + uint64_t latency; + void *timer_data; + + timer_data = timer_ops->create(NULL, 0, NULL, false); + + start = rte_get_tsc_cycles(); + + unsigned int i; + for (i = 0; i < MANAGE_ITERATIONS; i++) + if (user_provided_time && timer_ops->manage_time != NULL) { + uint64_t now; + + now = rte_get_tsc_cycles(); + + timer_ops->manage_time(timer_data, now); + } else + timer_ops->manage(timer_data); + + latency = rte_get_tsc_cycles() - start; + + timer_ops->destroy(timer_data); + + return latency / MANAGE_ITERATIONS; +} + +static void +benchmark_timer_libs_timeless_manage(bool user_provided_time) +{ + unsigned int i; + uint64_t nop_latency; + + nop_latency = run_manage(&nop_ops, user_provided_time); + + printf("Zero-timers manage() overhead%s:\n", user_provided_time ? 
+ " (w/ user-provided time)" : ""); + + for (i = 0; i < RTE_DIM(lib_ops); i++) { + const struct timer_lib_ops *ops = lib_ops[i]; + uint64_t latency; + + latency = run_manage(ops, user_provided_time); + + if (latency > nop_latency) + latency -= nop_latency; + else + latency = 0; + + printf(" %s: %"PRIu64" TSC cycles\n", ops->name, latency); + } +} + +static int +test_timer_htimer_htw_perf(void) +{ + /* warm up */ + rte_delay_us_block(10000); + + benchmark_timer_libs_rate(1e6); + + benchmark_timer_libs_timeless_manage(false); + benchmark_timer_libs_timeless_manage(true); + + return TEST_SUCCESS; +} + +REGISTER_TEST_COMMAND(timer_htimer_htw_perf_autotest, + test_timer_htimer_htw_perf); diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index 2deec7ea19..5ea1dfa262 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -67,6 +67,8 @@ The public API headers are grouped by topics: - **timers**: [cycles](@ref rte_cycles.h), [timer](@ref rte_timer.h), + [htimer_mgr](@ref rte_htimer_mgr.h), + [htimer](@ref rte_htimer.h), [alarm](@ref rte_alarm.h) - **locks**: @@ -163,7 +165,8 @@ The public API headers are grouped by topics: [ring](@ref rte_ring.h), [stack](@ref rte_stack.h), [tailq](@ref rte_tailq.h), - [bitmap](@ref rte_bitmap.h) + [bitmap](@ref rte_bitmap.h), + [bitset](@ref rte_bitset.h) - **packet framework**: * [port](@ref rte_port.h): diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index e859426099..c0cd64db34 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -45,6 +45,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/gro \ @TOPDIR@/lib/gso \ @TOPDIR@/lib/hash \ + @TOPDIR@/lib/htimer \ @TOPDIR@/lib/ip_frag \ @TOPDIR@/lib/ipsec \ @TOPDIR@/lib/jobstats \ diff --git a/lib/htimer/meson.build b/lib/htimer/meson.build new file mode 100644 index 0000000000..2dd5d6a24b --- /dev/null +++ b/lib/htimer/meson.build @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2023 Ericsson AB + +sources = files('rte_htw.c', 'rte_htimer_msg_ring.c', 'rte_htimer_mgr.c') +headers = files('rte_htimer_mgr.h', 'rte_htimer.h') + +deps += ['ring'] diff --git a/lib/htimer/rte_htimer.h b/lib/htimer/rte_htimer.h new file mode 100644 index 0000000000..6ac86292b5 --- /dev/null +++ b/lib/htimer/rte_htimer.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#ifndef _RTE_HTIMER_H_ +#define _RTE_HTIMER_H_ + +#include +#include +#include + +#include + +struct rte_htimer; + +typedef void (*rte_htimer_cb_t)(struct rte_htimer *, void *); + +struct rte_htimer { + /** + * Absolute timer expiration time (in ticks). + */ + uint64_t expiration_time; + /** + * Time between expirations (in ticks). Zero for one-shot timers. + */ + uint64_t period; + /** + * Owning lcore. May safely be read from any thread. + */ + uint32_t owner_lcore_id; + /** + * The current state of the timer. + */ + uint32_t state:4; + /** + * Flags set on this timer. + */ + uint32_t flags:28; + /** + * User-specified callback function pointer. + */ + rte_htimer_cb_t cb; + /** + * Argument for user callback. + */ + void *cb_arg; + /** + * Pointers used to add timer to various internal lists. 
+ */ + LIST_ENTRY(rte_htimer) entry; +}; + +#define RTE_HTIMER_FLAG_ABSOLUTE_TIME RTE_BIT32(0) +#define RTE_HTIMER_FLAG_PERIODICAL RTE_BIT32(1) +#define RTE_HTIMER_FLAG_TIME_TICK RTE_BIT32(2) +#define RTE_HTIMER_FLAG_TIME_TSC RTE_BIT32(3) + +#define RTE_HTIMER_STATE_PENDING 1 +#define RTE_HTIMER_STATE_EXPIRED 2 +#define RTE_HTIMER_STATE_CANCELED 3 + +LIST_HEAD(rte_htimer_list, rte_htimer); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_HTIMER_H_ */ diff --git a/lib/htimer/rte_htimer_mgr.c b/lib/htimer/rte_htimer_mgr.c new file mode 100644 index 0000000000..efdfcf0985 --- /dev/null +++ b/lib/htimer/rte_htimer_mgr.c @@ -0,0 +1,547 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include "rte_htimer_mgr.h" +#include "rte_htimer_msg.h" +#include "rte_htimer_msg_ring.h" + +#define MAX_MSG_BATCH_SIZE 16 + +struct htimer_mgr { + struct rte_htimer_msg_ring *msg_ring; + struct rte_htw *htw; + + unsigned int async_msgs_idx __rte_cache_aligned; + unsigned int num_async_msgs; + struct rte_htimer_msg async_msgs[MAX_MSG_BATCH_SIZE]; +} __rte_cache_aligned; + +static uint64_t ns_per_tick; +static double tsc_per_tick; + +static struct htimer_mgr mgrs[RTE_MAX_LCORE + 1]; + +#define MAX_ASYNC_TRANSACTIONS 1024 +#define MSG_RING_SIZE MAX_ASYNC_TRANSACTIONS + +static inline uint64_t +tsc_to_tick(uint64_t tsc) +{ + return tsc / tsc_per_tick; +} + +static inline uint64_t +tsc_to_tick_round_up(uint64_t tsc) +{ + uint64_t tick; + + tick = (tsc + tsc_per_tick / 2) / tsc_per_tick; + + return tick; +} + +static inline uint64_t +ns_to_tick(uint64_t ns) +{ + return ns / ns_per_tick; +} + +static inline uint64_t +ns_to_tick_round_up(uint64_t ns) +{ + uint64_t tick; + + tick = ceil(ns / ns_per_tick); + + return tick; +} + +static inline uint64_t +tick_to_ns(uint64_t tick) +{ + return tick * ns_per_tick; +} + +static struct htimer_mgr * +mgr_get(unsigned int lcore_id) +{ + return &mgrs[lcore_id]; +} + +static int +mgr_init(unsigned int lcore_id) +{ + char ring_name[RTE_RING_NAMESIZE]; + unsigned int socket_id; + struct htimer_mgr *mgr = &mgrs[lcore_id]; + + socket_id = rte_lcore_to_socket_id(lcore_id); + + snprintf(ring_name, sizeof(ring_name), "htimer_%d", lcore_id); + + mgr->msg_ring = + rte_htimer_msg_ring_create(ring_name, MSG_RING_SIZE, socket_id, + RING_F_SC_DEQ); + + if (mgr->msg_ring == NULL) + goto err; + + mgr->htw = rte_htw_create(); + + if (mgr->htw == NULL) + goto err_free_ring; + + mgr->async_msgs_idx = 0; + mgr->num_async_msgs = 0; + + return 0; + +err_free_ring: + rte_htimer_msg_ring_free(mgr->msg_ring); +err: + return -ENOMEM; +} + +static void +mgr_deinit(unsigned int lcore_id) +{ + struct htimer_mgr *mgr = &mgrs[lcore_id]; + + rte_htw_destroy(mgr->htw); + + rte_htimer_msg_ring_free(mgr->msg_ring); +} + +static volatile bool initialized; + +static void +assure_initialized(void) +{ + RTE_ASSERT(initialized); +} + +int +rte_htimer_mgr_init(uint64_t _ns_per_tick) +{ + unsigned int lcore_id; + + RTE_VERIFY(!initialized); + + ns_per_tick = _ns_per_tick; + + tsc_per_tick = (ns_per_tick / 1e9) * rte_get_tsc_hz(); + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + int rc; + + rc = mgr_init(lcore_id); + + if (rc < 0) { + unsigned int deinit_lcore_id; + + for (deinit_lcore_id = 0; deinit_lcore_id < lcore_id; + deinit_lcore_id++) + mgr_deinit(deinit_lcore_id); + + return rc; + } + } + + initialized = true; + + return 0; +} + +void 
+rte_htimer_mgr_deinit(void) +{ + unsigned int lcore_id; + + assure_initialized(); + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) + mgr_deinit(lcore_id); + + initialized = false; +} + +static void +assure_valid_time_conversion_flags(uint32_t flags __rte_unused) +{ + RTE_ASSERT(!((flags & RTE_HTIMER_FLAG_TIME_TSC) && + (flags & RTE_HTIMER_FLAG_TIME_TICK))); +} + +static void +assure_valid_add_flags(uint32_t flags) +{ + assure_valid_time_conversion_flags(flags); + + RTE_ASSERT(!(flags & ~(RTE_HTIMER_FLAG_PERIODICAL | + RTE_HTIMER_FLAG_ABSOLUTE_TIME | + RTE_HTIMER_FLAG_TIME_TSC | + RTE_HTIMER_FLAG_TIME_TICK))); +} + +static uint64_t +convert_time(uint64_t t, uint32_t flags) +{ + if (flags & RTE_HTIMER_FLAG_TIME_TSC) + return tsc_to_tick(t); + else if (flags & RTE_HTIMER_FLAG_TIME_TICK) + return t; + else + return ns_to_tick(t); +} + +void +rte_htimer_mgr_add(struct rte_htimer *timer, uint64_t expiration_time, + uint64_t period, rte_htimer_cb_t timer_cb, + void *timer_cb_arg, uint32_t flags) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + uint64_t expiration_time_tick; + uint64_t period_tick; + + assure_initialized(); + + assure_valid_add_flags(flags); + + expiration_time_tick = convert_time(expiration_time, flags); + + period_tick = convert_time(period, flags); + + rte_htw_add(mgr->htw, timer, expiration_time_tick, period_tick, + timer_cb, timer_cb_arg, flags); + + timer->owner_lcore_id = lcore_id; +} + +int +rte_htimer_mgr_cancel(struct rte_htimer *timer) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + + assure_initialized(); + + RTE_ASSERT(timer->owner_lcore_id == lcore_id); + + switch (timer->state) { + case RTE_HTIMER_STATE_PENDING: + rte_htw_cancel(mgr->htw, timer); + return 0; + case RTE_HTIMER_STATE_EXPIRED: + return -ETIME; + default: + RTE_ASSERT(timer->state == RTE_HTIMER_STATE_CANCELED); + return -ENOENT; + } +} + +static int +send_msg(unsigned int receiver_lcore_id, enum rte_htimer_msg_type msg_type, + struct rte_htimer *timer, rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg, const struct rte_htimer_msg_request *request, + const struct rte_htimer_msg_response *response) +{ + struct htimer_mgr *receiver_mgr; + struct rte_htimer_msg_ring *receiver_ring; + struct rte_htimer_msg msg = (struct rte_htimer_msg) { + .msg_type = msg_type, + .timer = timer, + .async_cb = async_cb, + .async_cb_arg = async_cb_arg + }; + int rc; + + if (request != NULL) + msg.request = *request; + else + msg.response = *response; + + receiver_mgr = mgr_get(receiver_lcore_id); + + receiver_ring = receiver_mgr->msg_ring; + + rc = rte_htimer_msg_ring_enqueue(receiver_ring, &msg); + + return rc; +} + +static int +send_request(unsigned int receiver_lcore_id, enum rte_htimer_msg_type msg_type, + struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, void *async_cb_arg) +{ + unsigned int lcore_id = rte_lcore_id(); + struct rte_htimer_msg_request request = { + .source_lcore_id = lcore_id + }; + + return send_msg(receiver_lcore_id, msg_type, timer, async_cb, + async_cb_arg, &request, NULL); +} + +static int +send_response(unsigned int receiver_lcore_id, enum rte_htimer_msg_type msg_type, + struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, void *async_cb_arg, + int result) +{ + struct rte_htimer_msg_response response = { + .result = result + }; + + return send_msg(receiver_lcore_id, msg_type, timer, async_cb, + async_cb_arg, NULL, &response); +} + +int 
+rte_htimer_mgr_async_add(struct rte_htimer *timer, + unsigned int target_lcore_id, + uint64_t expiration_time, uint64_t period, + rte_htimer_cb_t timer_cb, void *timer_cb_arg, + uint32_t flags, + rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg) +{ + *timer = (struct rte_htimer) { + .expiration_time = expiration_time, + .period = period, + .owner_lcore_id = target_lcore_id, + .flags = flags, + .cb = timer_cb, + .cb_arg = timer_cb_arg + }; + + assure_initialized(); + + if (send_request(target_lcore_id, rte_htimer_msg_type_add_request, + timer, async_cb, async_cb_arg) < 0) + return -EBUSY; + + return 0; +} + +int +rte_htimer_mgr_async_cancel(struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg) +{ + if (send_request(timer->owner_lcore_id, + rte_htimer_msg_type_cancel_request, + timer, async_cb, async_cb_arg) < 0) + return -EBUSY; + + return 0; +} + +static int +process_add_request(struct rte_htimer_msg *request) +{ + struct rte_htimer *timer = request->timer; + + if (request->async_cb != NULL && + send_response(request->request.source_lcore_id, + rte_htimer_msg_type_add_response, timer, + request->async_cb, request->async_cb_arg, + RTE_HTIMER_MGR_ASYNC_RESULT_ADDED) < 0) + return -EBUSY; + + rte_htimer_mgr_add(timer, timer->expiration_time, timer->period, + timer->cb, timer->cb_arg, timer->flags); + + return 0; +} + +static int +process_cancel_request(struct rte_htimer_msg *request) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + struct rte_htimer *timer = request->timer; + int result; + + switch (timer->state) { + case RTE_HTIMER_STATE_PENDING: + result = RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED; + break; + case RTE_HTIMER_STATE_CANCELED: + result = RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED; + break; + case RTE_HTIMER_STATE_EXPIRED: + result = RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED; + break; + default: + RTE_ASSERT(0); + result = -1; + } + + if (request->async_cb != NULL && + send_response(request->request.source_lcore_id, + rte_htimer_msg_type_cancel_response, timer, + request->async_cb, request->async_cb_arg, + result) < 0) + return -EBUSY; + + if (timer->state == RTE_HTIMER_STATE_PENDING) + rte_htw_cancel(mgr->htw, timer); + + return 0; +} + +static int +process_response(struct rte_htimer_msg *msg) +{ + struct rte_htimer_msg_response *response = &msg->response; + + if (msg->async_cb != NULL) + msg->async_cb(msg->timer, response->result, msg->async_cb_arg); + + return 0; +} + +static int +process_msg(struct rte_htimer_msg *msg) +{ + switch (msg->msg_type) { + case rte_htimer_msg_type_add_request: + return process_add_request(msg); + case rte_htimer_msg_type_cancel_request: + return process_cancel_request(msg); + case rte_htimer_msg_type_add_response: + case rte_htimer_msg_type_cancel_response: + return process_response(msg); + default: + RTE_ASSERT(0); + return -EBUSY; + } +} + +static void +dequeue_async_msgs(struct htimer_mgr *mgr) +{ + unsigned int i; + + if (likely(rte_htimer_msg_ring_empty(mgr->msg_ring))) + return; + + if (unlikely(mgr->num_async_msgs > 0)) + return; + + mgr->async_msgs_idx = 0; + + mgr->num_async_msgs = + rte_htimer_msg_ring_dequeue_burst(mgr->msg_ring, + mgr->async_msgs, + MAX_MSG_BATCH_SIZE); + + for (i = 0; i < mgr->num_async_msgs; i++) + rte_prefetch1(mgr->async_msgs[i].timer); +} + +static void +process_async(struct htimer_mgr *mgr) +{ + for (;;) { + struct rte_htimer_msg *msg; + + dequeue_async_msgs(mgr); + + if (mgr->num_async_msgs == 0) + break; + + msg = 
&mgr->async_msgs[mgr->async_msgs_idx]; + + if (process_msg(msg) < 0) + break; + + mgr->num_async_msgs--; + mgr->async_msgs_idx++; + } +} + +static __rte_always_inline void +htimer_mgr_manage_time(uint64_t current_time, uint32_t flags) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + uint64_t current_tick; + + assure_initialized(); + + assure_valid_time_conversion_flags(flags); + + process_async(mgr); + + current_tick = convert_time(current_time, flags); + + rte_htw_manage(mgr->htw, current_tick); +} + +void +rte_htimer_mgr_manage_time(uint64_t current_time, uint32_t flags) +{ + htimer_mgr_manage_time(current_time, flags); +} + +void +rte_htimer_mgr_manage(void) +{ + uint64_t current_time; + + current_time = rte_get_tsc_cycles(); + + htimer_mgr_manage_time(current_time, RTE_HTIMER_FLAG_TIME_TSC); +} + +void +rte_htimer_mgr_process(void) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + + process_async(mgr); + assure_initialized(); + + rte_htw_process(mgr->htw); +} + +uint64_t +rte_htimer_mgr_current_time(void) +{ + uint64_t current_tick; + + current_tick = rte_htimer_mgr_current_tick(); + + return tick_to_ns(current_tick); +} + +uint64_t +rte_htimer_mgr_current_tick(void) +{ + unsigned int lcore_id = rte_lcore_id(); + struct htimer_mgr *mgr = mgr_get(lcore_id); + uint64_t current_tick; + + current_tick = rte_htw_current_time(mgr->htw); + + return current_tick; +} diff --git a/lib/htimer/rte_htimer_mgr.h b/lib/htimer/rte_htimer_mgr.h new file mode 100644 index 0000000000..173a95f9c0 --- /dev/null +++ b/lib/htimer/rte_htimer_mgr.h @@ -0,0 +1,516 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#ifndef _RTE_HTIMER_MGR_H_ +#define _RTE_HTIMER_MGR_H_ + +/** + * @file + * + * RTE High-performance Timer Manager + * + * The high-performance timer manager (htimer_mgr) API provides access + * to a low-overhead, scalable timer service. + * + * The functionality offered similar to that of , but the + * internals differs significantly, and there are slight differences + * in the programming interface as well. + * + * Core timer management is implemented by means of a hierarchical + * timer wheel (HTW), as per the Varghese and Lauck paper Hashed + * and Hierarchical Timing Wheels: Data Structures for the Efficient + * Implementation of a Timer Facility. + * + * Varghese et al's approach is further enhanced by the placement of a + * bitset in front of each wheel's slots. Each slot has a + * corresponding bit in the bitset. If a bit is clear, there are no + * pending timers scheduled for that slot. A set bit means there + * potentially are timers scheduled for that slot. This scheme reduces + * the overhead of the rte_htimer_mgr_manage() function, where slots + * of one or more of the wheels of the thread's HTW are scanned if + * time has progressed since last call. This improves performance is + * all cases, except for very densely populated timer wheels. + * + * One such HTW is instantiated for each lcore (EAL thread), and + * instances are also available for registered non-EAL threads. + * + * The API may not be called from unregistered + * non-EAL threads. + * + * The per-lcore-id HTW instance is private to that thread. + * + * The htimer API supports scheduling timers to a different thread + * (and thus, a different HTW) than the caller's. It is also possible + * to cancel timers managed by a "remote" timer wheel. 
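For illustration, a minimal local-usage sketch (not part of the patch; the 1 us tick, the callback name and the surrounding loop are assumptions made for this example only) could look like this:

    #include <rte_htimer_mgr.h>

    static struct rte_htimer timer; /* must stay valid until expired or canceled */

    static void
    expiry_cb(struct rte_htimer *t, void *cb_arg)
    {
            /* runs on the owning lcore, from a manage/process call */
    }

    static void
    lcore_main(void)
    {
            /* 1000 ns (1 us) per tick; error handling omitted */
            rte_htimer_mgr_init(1000);

            /* one-shot timer, 500 us from now (nanoseconds by default) */
            rte_htimer_mgr_add(&timer, 500 * 1000, 0, expiry_cb, NULL, 0);

            for (;;) {
                    /* ... application work ... */
                    rte_htimer_mgr_manage();
            }
    }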
+ *
+ * All interaction (i.e., adding timers to or removing timers from) a
+ * remote HTW is done by sending a request, in the form of a message on
+ * a DPDK ring, to that instance. Such requests are processed and, if
+ * required, acknowledged when the remote (target) thread calls
+ * rte_htimer_mgr_manage(), rte_htimer_mgr_manage_time() or
+ * rte_htimer_mgr_process().
+ *
+ * This message-based interaction avoids comparatively heavy-weight
+ * synchronization primitives such as spinlocks. Only release-acquire
+ * type synchronization on the rings is needed.
+ *
+ * Timer memory management is the responsibility of the
+ * application. After library-level initialization has completed, no
+ * more dynamic memory is allocated by the htimer library. When
+ * installing timers on remote lcores, care must be taken by the
+ * application to avoid race conditions, in particular use-after-free
+ * (or use-after-recycle) issues of the rte_htimer structure. A timer
+ * struct may only be deallocated and/or recycled if the application
+ * can guarantee that there are no cancel requests in flight.
+ *
+ * The htimer library is able to give a definitive answer to the
+ * question of whether or not a remote timer had expired at the time
+ * of cancellation.
+ *
+ * The htimer library uses TSC as the default time source. A different
+ * time source may be used, in which case the application must
+ * explicitly provide the time using rte_htimer_mgr_manage_time().
+ * This function may also be used even if TSC is the time source, in
+ * cases where the application is already in possession of the current
+ * TSC time for some other purpose, to avoid the overhead of htimer's
+ * `rdtsc` instruction (or its equivalent on non-x86 ISAs).
+ *
+ * The htimer library supports periodical and single-shot timers.
+ *
+ * The timer tick defines a quantum of time in the htimer library. The
+ * length of a tick (quantified in nanoseconds) is left to the
+ * application to specify. The core HTW implementation allows for all
+ * 64 bits to be used.
+ *
+ * Very fine-grained ticks increase the HTW overhead (since more slots
+ * need to be scanned). Long ticks only allow for very coarse-grained
+ * timers, and in timer-heavy applications may cause load spikes when
+ * time advances into a new tick.
+ *
+ * Reasonable timer tick lengths range from 100 ns up to 100 us (or
+ * perhaps as high as 1 ms), depending on the application.
+ */
+
+#include
+
+#include
+#include
+#include
+
+/**
+ * The timer has been added to the timer manager on the target lcore.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_ADDED 1
+
+/**
+ * The timer cancellation request has completed, before the timer expired
+ * on the target lcore.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED 2
+
+/**
+ * The timer cancellation request was denied, since the timer was
+ * already marked as canceled.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED 3
+
+/**
+ * At the time the cancellation request was processed on the target
+ * lcore, the timer had already expired.
+ */
+#define RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED 4
+
+typedef void (*rte_htimer_mgr_async_op_cb_t)(struct rte_htimer *timer,
+                                             int result, void *cb_arg);
+
+/**
+ * Initialize the htimer library.
+ *
+ * Instantiates per-lcore (or per-registered non-EAL thread) timer
+ * wheels and other htimer library data structures, for all current
+ * and future threads.
+ *
+ * This function must be called prior to any other API
+ * call.
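For instance, an application using a 1 us tick might initialize the library as follows (an illustrative sketch only, not part of the patch):

    int rc;

    /* tick length: 1000 ns; the default TSC time source is used */
    rc = rte_htimer_mgr_init(1000);
    if (rc < 0)
            rte_panic("Unable to initialize the htimer library\n");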
+ * + * This function may not be called if the htimer library is already + * initialized, but may be called multiple times, provided the library + * is deinitialized in between rte_htimer_mgr_init() calls. + * + * For applications not using TSC as the time source, the \c ns_per_tick + * parameter will denote the number of such application time-source-units + * per tick. + * + * This function is not multi-thread safe. + * + * @param ns_per_tick + * The length (in nanoseconds) of a timer wheel tick. + * + * @return + * - 0: Success + * - -ENOMEM: Unable to allocate memory needed to initialize timer + * subsystem + * + * @see rte_htimer_mgr_deinit() + * @see rte_get_tsc_hz() + */ + +__rte_experimental +int +rte_htimer_mgr_init(uint64_t ns_per_tick); + +/** + * Deinitialize the htimer library. + * + * This function deallocates all dynamic memory used by the library, + * including HTW instances used by other threads than the caller. + * + * After this call has been made, no API call may be + * made, except rte_htimer_mgr_init(). + * + * This function may not be called if the htimer library has never be + * initialized, or has been be deinitialized but not yet initialized + * again. + * + * This function is not multi-thread safe. In particular, no thread + * may call any functions (e.g., rte_htimer_mgr_manage()) + * during (or after) the htimer library is deinitialized, except if it + * is initialized again. + * + * @see rte_htimer_mgr_init() + */ + +__rte_experimental +void +rte_htimer_mgr_deinit(void); + +/** + * Adds a timer to the calling thread's timer wheel. + * + * This function schedules a timer on the calling thread's HTW. + * + * The \c timer_cb callback is called at a point when this thread + * calls rte_htimer_mgr_process(), rte_htimer_mgr_manage(), or + * rte_htimer_mgr_manage_time() and the expiration time has passed the + * current time (either as retrieved by rte_htimer_mgr_manage() or + * specified by the application in rte_htimer_mgr_manage_time(). + * + * The HTW trackes times in units of \c ticks, which are likely more + * coarse-grained than nanosecond and TSC resolution. + * + * By default, the \c expiration_time is interpreted as the number of + * the nanoseconds into the future the timer should expired, relative + * to the last known current time, rounded up to the nearest tick. + * Thus, a timer with a certain expiration time maybe not expire even + * though this time was supplied in rte_timer_manage_time(). The + * maximum error is the length of one tick (plus any delays caused by + * infrequent manage calls). + * + * If the \c RTE_HTIMER_FLAG_ABSOLUTE_TIME is set in \c flags, the + * expiration time is relative to time zero. + * + * If the \c RTE_HTIMER_FLAG_PERIODICAL flag is set, the timer is + * peridoical, and will expire first at the time specified by + * the \c expiration_time, and then with an interval as specified + * by the \c period parameter. + * + * An added timer may be canceled using rte_htimer_mgr_cancel() or + * rte_htimer_mgr_async_cancel(). + * + * rte_htimer_mgr_add() is multi-thread safe, and may only be called + * from an EAL thread or a registered non-EAL thread. + * + * @param timer + * The chunk of memory used for managing this timer. This memory + * must not be read or written (or free'd) by the application until + * this timer has expired, or any cancellation attempts have + * completed. + * @param expiration_time + * The expiration time (in nanoseconds by default). For periodical + * timers, this time represent the first expiration time. 
+ * @param period + * The time in between periodic timer expirations (in nanoseconds by + * default). Must be set to zero unless the + * \c RTE_HTIMER_FLAG_PERIODICAL flag is set, in case it must be a + * positive integer. + * @param timer_cb + * The timer callback to be called upon timer expiration. + * @param timer_cb_arg + * A pointer which will be supplied back to the application in the + * timer callback call. + * @param flags + * A bitmask which may contain these flags: + * * \c RTE_HTIMER_FLAG_PERIODICAL + * * \c RTE_HTIMER_FLAG_ABSOLUTE_TIME + * * Either \c RTE_HTIMER_FLAG_TIME_TICK or \c RTE_HTIMER_FLAG_TIME_TSC + */ + +__rte_experimental +void +rte_htimer_mgr_add(struct rte_htimer *timer, uint64_t expiration_time, + uint64_t period, rte_htimer_cb_t timer_cb, + void *timer_cb_arg, uint32_t flags); + +/** + * Cancel a timer scheduled in the calling thread's timer wheel. + * + * This function cancel a timer scheduled on the calling thread's HTW. + * + * rte_htimer_mgr_cancel() may be called on a timer which has already + * (synchronously or asynchronously) been canceled, or may have expired. + * However, the \c rte_htimer struct pointed to by \c timer may not + * have been freed or recycled since. + * + * rte_htimer_mgr_cancel() may not be called for a timer that was + * never (or, not yet) added. + * + * A timer added using rte_htimer_mgr_async_add() may be not be + * canceled using this function until after the add operation has + * completed (i.e, the completion callback has been run). + * + * rte_htimer_mgr_cancel() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. + * + * @param timer + * The timer to be canceled. + * @return + * - 0: Success + * - -ETIME: Timer has expired, and thus could not be canceled. + * - -ENOENT: Timer was already canceled. + */ + +__rte_experimental +int +rte_htimer_mgr_cancel(struct rte_htimer *timer); + +/** + * Asynchronuosly add a timer to the specified lcore's timer wheel. + * + * This function is the equivalent of rte_htimer_mgr_add(), but allows + * the calling ("source") thread to scheduled a timer in a HTW other + * than it's own. The operation is asynchronous. + * + * The timer works the same as a timer added locally. Thus, the \c + * timer_cb callback is called by the target thread, and it may be + * canceled using rte_htimer_mgr_cancel(). + * + * The source thread may be the same as the target thread. + * + * Only EAL threads or registered non-EAL thread may be targeted. + * + * A successful rte_htimer_mgr_async_add() call guarantees that the + * timer will be scheduled on the target lcore at some future time, + * provided the target thread calls either rte_htimer_mgr_process(), + * rte_htimer_mgr_manage(), and/or rte_htimer_mgr_manage_time(). + * + * The \c async_cb callback is called on the source thread as a part + * of its rte_htimer_mgr_process(), rte_htimer_mgr_manage(), or + * rte_htimer_mgr_manage_time() call, when the asynchronous add + * operation has completed (i.e., the timer is scheduled in the target + * HTW). + * + * \c async_cb may be NULL, in which case no notification is given. + * + * An asynchronously added timer may be asynchronously canceled (i.e., + * using rte_htimer_mgr_async_cancel()) at any point, by any thread, + * after the rte_htimer_mgr_async_add() call. A asynchronously added + * timer may be not be canceled using rte_htimer_mgr_cancel() until + * after the completion callback has been executed. 
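To sketch the asynchronous add flow (illustrative only, not part of the patch; 'timer' is assumed to be an application-owned struct rte_htimer, 'target_lcore' a valid lcore id, and the callback names are hypothetical):

    static void
    remote_expiry_cb(struct rte_htimer *t, void *cb_arg)
    {
            /* runs on the target lcore */
    }

    static void
    add_done_cb(struct rte_htimer *t, int result, void *cb_arg)
    {
            /* runs on the source lcore;
             * result == RTE_HTIMER_MGR_ASYNC_RESULT_ADDED */
    }

    /* one-shot timer, 1 ms from now, scheduled on another lcore's HTW */
    if (rte_htimer_mgr_async_add(&timer, target_lcore, 1000 * 1000, 0,
                                 remote_expiry_cb, NULL, 0,
                                 add_done_cb, NULL) == -EBUSY) {
            /* request ring full; call rte_htimer_mgr_process() (or one
             * of the manage functions) and retry */
    }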
+ * + * rte_htimer_mgr_async_add() is multi-thread safe, and may only be called + * from an EAL thread or a registered non-EAL thread. + * + * @param timer + * The chunk of memory used for managing this timer. This memory + * must not be read or written (or free'd) by the application until + * this timer has expired, or any cancellation attempts have + * completed. + * @param target_lcore_id + * The lcore id of the thread which HTW will be manage this timer. + * @param expiration_time + * The expiration time (measured in nanoseconds). For periodical + * timers, this time represent the first expiration time. + * @param period + * The time in between periodic timer expirations (measured in + * nanoseconds). Must be set to zero unless the + * RTE_HTIMER_FLAG_PERIODICAL flag is set, in case it must be a + * positive integer. + * @param timer_cb + * The timer callback to be called upon timer expiration. + * @param timer_cb_arg + * A pointer which will be supplied back to the application in the + * timer callback call. + * @param async_cb + * The asynchronous operationg callback to be called when the + * add operation is completed. + * @param async_cb_arg + * A pointer which will be supplied back to the application in the + * \c async_cb callback call. + * @param flags + * RTE_HTIMER_FLAG_ABSOLUTE_TIME and/or RTE_HTIMER_FLAG_PERIODICAL. + * @return + * - 0: Success + * - -EBUSY: The maximum number of concurrently queued asynchronous + * operations has been reached. + */ + +__rte_experimental +int +rte_htimer_mgr_async_add(struct rte_htimer *timer, + unsigned int target_lcore_id, + uint64_t expiration_time, uint64_t period, + rte_htimer_cb_t timer_cb, void *timer_cb_arg, + uint32_t flags, + rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg); + +/** + * Asynchronuosly cancel a timer in any thread's timer wheel. + * + * This function is the equivalent of rte_htimer_mgr_cancel(), but + * allows the calling ("source") thread to also cancel a timer in a + * HTW other than it's own. The operation is asynchronous. + * + * A thread may asynchronously cancel a timer scheduled on its own + * HTW. + * + * The \c async_cb callback is called on the source thread as a part + * of its rte_htimer_mgr_process(), rte_htimer_mgr_manage(), or + * rte_htimer_mgr_manage_time() call, when the asynchronous add + * operation has completed (i.e., the timer is scheduled in the target + * HTW). + * + * \c async_cb may be NULL, in which case no notification is given. + * + * A timer may be asynchronously canceled at any point, by any thread, + * after it has been either synchronously or asynchronously added. + * + * rte_htimer_mgr_async_cancel() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. + * + * @param timer + * The memory used for managing this timer. This memory must not be + * read or written (or free'd) by the application until this timer + * has expired, or any cancellation attempts have completed. + * @param async_cb + * The asynchronous operationg callback to be called when the + * add operation is completed. + * @param async_cb_arg + * A pointer which will be supplied back to the application in the + * \c async_cb callback call. + * @return + * - 0: Success + * - -EBUSY: The maximum number of concurrently queued asynchronous + * operations has been reached. 
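A sketch of how the completion results of an asynchronous cancel might be handled (illustrative only; 'timer' and the callback name are assumptions for the example):

    static void
    cancel_done_cb(struct rte_htimer *t, int result, void *cb_arg)
    {
            switch (result) {
            case RTE_HTIMER_MGR_ASYNC_RESULT_CANCELED:
                    /* the timer will not fire; its memory may be reused */
                    break;
            case RTE_HTIMER_MGR_ASYNC_RESULT_ALREADY_CANCELED:
                    /* canceled earlier, by this or some other thread */
                    break;
            case RTE_HTIMER_MGR_ASYNC_RESULT_EXPIRED:
                    /* too late: the timer callback has run (or will
                     * run) on the owning lcore */
                    break;
            }
    }

    if (rte_htimer_mgr_async_cancel(&timer, cancel_done_cb, NULL) == -EBUSY) {
            /* request ring full; process/manage and retry */
    }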
+ */ + +__rte_experimental +int +rte_htimer_mgr_async_cancel(struct rte_htimer *timer, + rte_htimer_mgr_async_op_cb_t async_cb, + void *async_cb_arg); + +/** + * Update HTW time and perform timer expiry and asynchronous operation + * processing. + * + * This function is the equivalent of retrieving the current TSC time, + * and calling rte_htimer_mgr_manage_time(). + * + * rte_htimer_mgr_manage() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. + */ + +__rte_experimental +void +rte_htimer_mgr_manage(void); + +/** + * Progress HTW time, and perform timer expiry and asynchronous + * operation processing in the process. + * + * This function progress the calling thread's HTW up to the point + * specified by \c current_time, calling the callbacks of any expired + * timers. + * + * The time source must be a monotonic clock, and thus each new \c + * current_time must be equal or greater than the time supplied in the + * previous call. + * + * The timer precision for timers scheduled on a particular thread's + * HTW depends on that threads call frequency to this function. + * + * rte_htimer_mgr_manage_time() also performs asynchronous operation + * processing. See rte_htimer_mgr_process() for details. + * + * rte_htimer_mgr_manage_time() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. + * + * @param current_time + * The current time (in nanoseconds, by default). + * @param flags + * Either \c RTE_HTIMER_FLAG_TIME_TICK or \c RTE_HTIMER_FLAG_TIME_TSC. + */ + +__rte_experimental +void +rte_htimer_mgr_manage_time(uint64_t current_time, uint32_t flags); + +/** + * Perform asynchronous operation processing. + * + * rte_htimer_mgr_process() serves pending asynchronous add or cancel + * requests, and produces the necessary responses. The timer callbacks + * of already-expired timers added are called. + * + * This function also processes asynchronous operation response + * messages received, and calls the asynchronous callbacks, if such + * was provided by the application. + * + * rte_htimer_mgr_process() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. + */ + +__rte_experimental +void +rte_htimer_mgr_process(void); + +/** + * Return the current local HTW time in nanoseconds. + * + * This function returns the most recent time provided by this thread, + * either via rte_htimer_mgr_manage_time(), or as sampled by + * rte_htimer_mgr_manage(). + * + * The initial time, prior to any manage-calls, is 0. + * + * rte_htimer_mgr_current_time() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. + */ + +__rte_experimental +uint64_t +rte_htimer_mgr_current_time(void); + +/** + * Return the current local HTW time in ticks. + * + * This function returns the current time of the calling thread's HTW. The + * tick is the current time provided by the application (via + * rte_htimer_mgr_manage_time()), or as retrieved (using + * rte_timer_get_tsc_cycles() in rte_htimer_mgr_manage()), divided by the + * tick length (as provided in rte_htimer_mgr_init()). + * + * The initial time, prior to any manage-calls, is 0. + * + * rte_htimer_mgr_current_tick() is multi-thread safe, and may only be + * called from an EAL thread or a registered non-EAL thread. 
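Tying the manage and time-query calls together, a short sketch (assuming the default TSC time source and an already-initialized library) could be:

    /* reuse a TSC sample the application already took, avoiding a
     * second rdtsc inside the library */
    uint64_t now = rte_get_tsc_cycles();

    rte_htimer_mgr_manage_time(now, RTE_HTIMER_FLAG_TIME_TSC);

    /* the HTW's notion of time: the tick count, and the same value
     * scaled by the tick length given to rte_htimer_mgr_init() */
    uint64_t tick = rte_htimer_mgr_current_tick();
    uint64_t t_ns = rte_htimer_mgr_current_time();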
+ */ + +__rte_experimental +uint64_t +rte_htimer_mgr_current_tick(void); + +#endif diff --git a/lib/htimer/rte_htimer_msg.h b/lib/htimer/rte_htimer_msg.h new file mode 100644 index 0000000000..ceb106e263 --- /dev/null +++ b/lib/htimer/rte_htimer_msg.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#ifndef _RTE_HTIMER_MSG_ +#define _RTE_HTIMER_MSG_ + +#include + +typedef void (*rte_htimer_msg_async_op_cb_t)(struct rte_htimer *timer, + int result, void *cb_arg); + +typedef rte_htimer_msg_async_op_cb_t async_cb; + +enum rte_htimer_msg_type { + rte_htimer_msg_type_add_request, + rte_htimer_msg_type_add_response, + rte_htimer_msg_type_cancel_request, + rte_htimer_msg_type_cancel_response +}; + +struct rte_htimer_msg_request { + unsigned int source_lcore_id; +}; + +struct rte_htimer_msg_response { + int result; +}; + +struct rte_htimer_msg { + enum rte_htimer_msg_type msg_type; + + struct rte_htimer *timer; + + rte_htimer_msg_async_op_cb_t async_cb; + void *async_cb_arg; + + union { + struct rte_htimer_msg_request request; + struct rte_htimer_msg_response response; + }; +}; + +#endif diff --git a/lib/htimer/rte_htimer_msg_ring.c b/lib/htimer/rte_htimer_msg_ring.c new file mode 100644 index 0000000000..4019b7819a --- /dev/null +++ b/lib/htimer/rte_htimer_msg_ring.c @@ -0,0 +1,18 @@ +#include "rte_htimer_msg_ring.h" + +struct rte_htimer_msg_ring * +rte_htimer_msg_ring_create(const char *name, unsigned int count, int socket_id, + unsigned int flags) +{ + struct rte_ring *ring = + rte_ring_create_elem(name, sizeof(struct rte_htimer_msg), + count, socket_id, flags); + + return (struct rte_htimer_msg_ring *)ring; +} + +void +rte_htimer_msg_ring_free(struct rte_htimer_msg_ring *msg_ring) +{ + rte_ring_free((struct rte_ring *)msg_ring); +} diff --git a/lib/htimer/rte_htimer_msg_ring.h b/lib/htimer/rte_htimer_msg_ring.h new file mode 100644 index 0000000000..48fcc99189 --- /dev/null +++ b/lib/htimer/rte_htimer_msg_ring.h @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#ifndef _RTE_HTIMER_MSG_RING_ +#define _RTE_HTIMER_MSG_RING_ + +#include + +#include "rte_htimer_msg.h" + +struct rte_htimer_msg_ring { + struct rte_ring ring; +}; + +struct rte_htimer_msg_ring * +rte_htimer_msg_ring_create(const char *name, unsigned int count, int socket_id, + unsigned int flags); + +void +rte_htimer_msg_ring_free(struct rte_htimer_msg_ring *msg_ring); + +static inline int +rte_htimer_msg_ring_empty(struct rte_htimer_msg_ring *msg_ring) +{ + return rte_ring_empty(&msg_ring->ring); +} + +static inline unsigned int +rte_htimer_msg_ring_dequeue_burst(struct rte_htimer_msg_ring *msg_ring, + struct rte_htimer_msg *msgs, + unsigned int n) +{ + unsigned int dequeued; + + dequeued = rte_ring_dequeue_burst_elem(&msg_ring->ring, msgs, + sizeof(struct rte_htimer_msg), + n, NULL); + + return dequeued; +} + +static inline unsigned int +rte_htimer_msg_ring_enqueue(struct rte_htimer_msg_ring *msg_ring, + struct rte_htimer_msg *msg) +{ + int rc; + + rc = rte_ring_enqueue_elem(&msg_ring->ring, msg, + sizeof(struct rte_htimer_msg)); + + return rc; +} + +#endif diff --git a/lib/htimer/rte_htw.c b/lib/htimer/rte_htw.c new file mode 100644 index 0000000000..67fcb8c623 --- /dev/null +++ b/lib/htimer/rte_htw.c @@ -0,0 +1,445 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +/* + * This is an implementation of a hierarchical timer wheel based on + * Hashed and Hierarchical Timing Wheels: Data Structures + 
* for the Efficient Implementation of a Timer Facility by Varghese and + * Lauck. + * + * To improve efficiency when the slots are sparsely populate (i.e., + * many ticks do not have any timers), each slot is represented by a + * bit in a separately-managed, per-wheel, bitset. This allows for + * very efficient scanning. The cost of managing this bitset is small. + */ + +#include +#include +#include +#include +#include + +#include "rte_htw.h" + +#define TICK_BITS 64 + +#define WHEEL_BITS 8 +#define WHEEL_SLOTS (1U << WHEEL_BITS) +#define WHEEL_LEVELS (TICK_BITS / WHEEL_BITS) + +struct wheel { + uint64_t wheel_time; + RTE_BITSET_DECLARE(used_slots, WHEEL_SLOTS); + struct rte_htimer_list slots[WHEEL_SLOTS]; +}; + +struct rte_htw { + uint64_t current_time; + + struct wheel wheels[WHEEL_LEVELS]; + + struct rte_htimer_list added; + struct rte_htimer_list expiring; + + struct rte_htimer *running_timer; +}; + +static uint64_t +time_to_wheel_time(uint64_t t, uint16_t level) +{ + return t >> (level * WHEEL_BITS); +} + +static uint64_t +wheel_time_to_time(uint64_t wheel_time, uint16_t level) +{ + return wheel_time << (level * WHEEL_BITS); +} + +static void +wheel_init(struct wheel *wheel) +{ + uint16_t i; + + wheel->wheel_time = 0; + + rte_bitset_init(wheel->used_slots, WHEEL_SLOTS); + + for (i = 0; i < WHEEL_SLOTS; i++) + LIST_INIT(&wheel->slots[i]); +} + +static uint64_t +list_next_timeout(struct rte_htimer_list *timers) +{ + struct rte_htimer *timer; + uint64_t candidate = UINT64_MAX; + + LIST_FOREACH(timer, timers, entry) + candidate = RTE_MIN(timer->expiration_time, candidate); + + return candidate; +} + +static uint16_t +wheel_time_to_slot(uint64_t wheel_time) +{ + return wheel_time % WHEEL_SLOTS; +} + +static uint64_t +wheel_current_slot_time(struct wheel *wheel, uint16_t level) +{ + return wheel->wheel_time << (level * WHEEL_BITS); +} + +static uint64_t +wheel_next_timeout(struct wheel *wheel, uint16_t level, uint64_t upper_bound) +{ + uint16_t start_slot; + ssize_t slot; + + start_slot = wheel_current_slot_time(wheel, level); + + if (wheel_time_to_time(wheel->wheel_time, level) >= upper_bound) + return upper_bound; + + RTE_BITSET_FOREACH_SET_WRAP(slot, wheel->used_slots, WHEEL_SLOTS, + (ssize_t)start_slot, WHEEL_SLOTS) { + struct rte_htimer_list *timers = &wheel->slots[slot]; + uint64_t next_timeout; + + next_timeout = list_next_timeout(timers); + + if (next_timeout != UINT64_MAX) + return next_timeout; + } + + return UINT64_MAX; +} + +static uint16_t +get_slot(uint64_t t, uint16_t level) +{ + uint64_t wheel_time; + uint16_t slot; + + wheel_time = time_to_wheel_time(t, level); + slot = wheel_time_to_slot(wheel_time); + + return slot; +} + +struct rte_htw * +rte_htw_create(void) +{ + struct rte_htw *htw; + uint16_t level; + + RTE_BUILD_BUG_ON((TICK_BITS % WHEEL_BITS) != 0); + RTE_BUILD_BUG_ON(sizeof(uint16_t) * CHAR_BIT <= WHEEL_BITS); + + htw = rte_malloc(NULL, sizeof(struct rte_htw), RTE_CACHE_LINE_SIZE); + + if (htw == NULL) { + rte_errno = ENOMEM; + return NULL; + } + + htw->current_time = 0; + + LIST_INIT(&htw->added); + LIST_INIT(&htw->expiring); + + for (level = 0; level < WHEEL_LEVELS; level++) + wheel_init(&htw->wheels[level]); + + return htw; +} + +void +rte_htw_destroy(struct rte_htw *htw) +{ + rte_free(htw); +} + +static uint16_t +get_level(uint64_t remaining_time) +{ + int last_set = 64 - __builtin_clzll(remaining_time); + + return (last_set - 1) / WHEEL_BITS; +} + +static void +mark_added(struct rte_htw *htw, struct rte_htimer *timer) +{ + timer->state = 
RTE_HTIMER_STATE_PENDING; + LIST_INSERT_HEAD(&htw->added, timer, entry); +} + +static void +assure_valid_add_params(uint64_t period __rte_unused, + uint32_t flags __rte_unused) +{ + RTE_ASSERT(!(flags & ~(RTE_HTIMER_FLAG_PERIODICAL | + RTE_HTIMER_FLAG_ABSOLUTE_TIME))); + RTE_ASSERT(flags & RTE_HTIMER_FLAG_PERIODICAL ? + period > 0 : period == 0); +} + +void +rte_htw_add(struct rte_htw *htw, struct rte_htimer *timer, + uint64_t expiration_time, uint64_t period, + rte_htimer_cb_t timer_cb, void *timer_cb_arg, uint32_t flags) +{ + assure_valid_add_params(period, flags); + + if (flags & RTE_HTIMER_FLAG_ABSOLUTE_TIME) + timer->expiration_time = expiration_time; + else + timer->expiration_time = htw->current_time + expiration_time; + + timer->period = period; + timer->flags = flags; + timer->cb = timer_cb; + timer->cb_arg = timer_cb_arg; + + mark_added(htw, timer); +} + +void +rte_htw_cancel(struct rte_htw *htw, struct rte_htimer *timer) +{ + /* + * One could consider clearing the relevant used_slots bit in + * case this was the last entry in the wheel's slot + * list. However, from a correctness point of view, a "false + * positive" is not an issue. From a performance perspective, + * checking the list head and clearing the bit is likely more + * expensive than just deferring a minor cost to a future + * rte_htw_manage() call. + */ + + RTE_ASSERT(timer->state == RTE_HTIMER_STATE_PENDING || + timer->state == RTE_HTIMER_STATE_EXPIRED); + + if (likely(timer->state == RTE_HTIMER_STATE_PENDING)) { + LIST_REMOVE(timer, entry); + timer->state = RTE_HTIMER_STATE_CANCELED; + } else if (timer == htw->running_timer) { + /* periodical timer being canceled by its own callback */ + RTE_ASSERT(timer->flags & RTE_HTIMER_FLAG_PERIODICAL); + + timer->state = RTE_HTIMER_STATE_CANCELED; + + /* signals running timer canceled */ + htw->running_timer = NULL; + } +} + +static void +mark_expiring(struct rte_htw *htw, struct rte_htimer *timer) +{ + LIST_INSERT_HEAD(&htw->expiring, timer, entry); +} + +static void +schedule_timer(struct rte_htw *htw, struct rte_htimer *timer) +{ + uint64_t remaining_time; + uint16_t level; + struct wheel *wheel; + uint16_t slot; + struct rte_htimer_list *slot_timers; + + remaining_time = timer->expiration_time - htw->current_time; + + level = get_level(remaining_time); + + wheel = &htw->wheels[level]; + + slot = get_slot(timer->expiration_time, level); + + slot_timers = &htw->wheels[level].slots[slot]; + + LIST_INSERT_HEAD(slot_timers, timer, entry); + + rte_bitset_set(wheel->used_slots, slot); +} + +static void +process_added(struct rte_htw *htw) +{ + struct rte_htimer *timer; + + while ((timer = LIST_FIRST(&htw->added)) != NULL) { + LIST_REMOVE(timer, entry); + + if (timer->expiration_time > htw->current_time) + schedule_timer(htw, timer); + else + mark_expiring(htw, timer); + } +} + +static void +process_expiring(struct rte_htw *htw) +{ + struct rte_htimer *timer; + + while ((timer = LIST_FIRST(&htw->expiring)) != NULL) { + bool is_periodical; + bool running_timer_canceled; + + /* + * The timer struct may cannot be safely accessed + * after the callback has been called (except for + * non-canceled periodical timers), since the callback + * may have free'd (or reused) the memory. 
+ */ + + LIST_REMOVE(timer, entry); + + is_periodical = timer->flags & RTE_HTIMER_FLAG_PERIODICAL; + + timer->state = RTE_HTIMER_STATE_EXPIRED; + + htw->running_timer = timer; + + timer->cb(timer, timer->cb_arg); + + running_timer_canceled = htw->running_timer == NULL; + + htw->running_timer = NULL; + + if (is_periodical && !running_timer_canceled) { + timer->expiration_time += timer->period; + mark_added(htw, timer); + } + } +} + +uint64_t +rte_htw_current_time(struct rte_htw *htw) +{ + return htw->current_time; +} + +uint64_t +rte_htw_next_timeout(struct rte_htw *htw, uint64_t upper_bound) +{ + uint16_t level; + + /* scheduling timeouts will sort them in temporal order */ + process_added(htw); + + if (!LIST_EMPTY(&htw->expiring)) + return 0; + + for (level = 0; level < WHEEL_LEVELS; level++) { + uint64_t wheel_timeout; + + wheel_timeout = wheel_next_timeout(&htw->wheels[level], + level, upper_bound); + if (wheel_timeout != UINT64_MAX) + return RTE_MIN(wheel_timeout, upper_bound); + } + + return upper_bound; +} + +static __rte_always_inline void +process_slot(struct rte_htw *htw, uint16_t level, struct wheel *wheel, + uint16_t slot) +{ + struct rte_htimer_list *slot_timers; + struct rte_htimer *timer; + + slot_timers = &wheel->slots[slot]; + + rte_bitset_clear(wheel->used_slots, slot); + + while ((timer = LIST_FIRST(slot_timers)) != NULL) { + LIST_REMOVE(timer, entry); + + if (level == 0 || timer->expiration_time <= htw->current_time) + mark_expiring(htw, timer); + else + schedule_timer(htw, timer); + } +} + +static __rte_always_inline void +process_slots(struct rte_htw *htw, uint16_t level, struct wheel *wheel, + uint16_t start_slot, uint16_t num_slots) +{ + ssize_t slot; + + RTE_BITSET_FOREACH_SET_WRAP(slot, wheel->used_slots, WHEEL_SLOTS, + (ssize_t)start_slot, num_slots) + process_slot(htw, level, wheel, slot); +} + +static void +advance(struct rte_htw *htw) +{ + uint16_t level; + + for (level = 0; level < WHEEL_LEVELS; level++) { + struct wheel *wheel = &htw->wheels[level]; + uint64_t new_wheel_time; + uint16_t start_slot; + uint16_t num_slots; + + new_wheel_time = time_to_wheel_time(htw->current_time, level); + + if (new_wheel_time == wheel->wheel_time) + break; + + start_slot = wheel_time_to_slot(wheel->wheel_time + 1); + num_slots = RTE_MIN(new_wheel_time - wheel->wheel_time, + WHEEL_SLOTS); + + wheel->wheel_time = new_wheel_time; + + process_slots(htw, level, wheel, start_slot, num_slots); + } +} + +void +rte_htw_manage(struct rte_htw *htw, uint64_t new_time) +{ + RTE_VERIFY(new_time >= htw->current_time); + + /* + * Scheduling added timers, core timer wheeling processing and + * expiry callback execution is kept as separate stages, to + * avoid having the core wheel traversal code to deal with a + * situation where a timeout callbacks re-adding the timer. + * This split also results in seemingly reasonable semantics + * in regards to the execution of the callbacks of + * already-expired timeouts (e.g., with time 0) being added in + * a timeout callback. Instead of creating an end-less loop, + * with rte_htw_manage() never returning, it defers the + * execution of the timer until the next rte_htw_manage() + * call. 
+ */ + + process_added(htw); + + if (new_time > htw->current_time) { + htw->current_time = new_time; + advance(htw); + } + + process_expiring(htw); +} + +void +rte_htw_process(struct rte_htw *htw) +{ + process_added(htw); + process_expiring(htw); +} diff --git a/lib/htimer/rte_htw.h b/lib/htimer/rte_htw.h new file mode 100644 index 0000000000..c93358bb13 --- /dev/null +++ b/lib/htimer/rte_htw.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2023 Ericsson AB + */ + +#ifndef _RTE_HTW_H_ +#define _RTE_HTW_H_ + +#include +#include + +#include + +#ifdef __cplusplus +extern "C" { +#endif + +struct rte_htw; + +struct rte_htw * +rte_htw_create(void); + +void +rte_htw_destroy(struct rte_htw *htw); + +void +rte_htw_add(struct rte_htw *htw, struct rte_htimer *timer, + uint64_t expiration_time, uint64_t period, + rte_htimer_cb_t cb, void *cb_arg, uint32_t flags); + +void +rte_htw_cancel(struct rte_htw *htw, struct rte_htimer *timer); + +uint64_t +rte_htw_current_time(struct rte_htw *htw); + +uint64_t +rte_htw_next_timeout(struct rte_htw *htw, uint64_t upper_bound); + +void +rte_htw_manage(struct rte_htw *htw, uint64_t new_time); + +void +rte_htw_process(struct rte_htw *htw); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_HTW_H_ */ diff --git a/lib/htimer/version.map b/lib/htimer/version.map new file mode 100644 index 0000000000..0e71dc7d57 --- /dev/null +++ b/lib/htimer/version.map @@ -0,0 +1,17 @@ +EXPERIMENTAL { + global: + + rte_htimer_mgr_init; + rte_htimer_mgr_deinit; + rte_htimer_mgr_add; + rte_htimer_mgr_cancel; + rte_htimer_mgr_async_add; + rte_htimer_mgr_async_cancel; + rte_htimer_mgr_manage; + rte_htimer_mgr_manage_time; + rte_htimer_mgr_process; + rte_htimer_mgr_current_time; + rte_htimer_mgr_current_tick; + + local: *; +}; diff --git a/lib/meson.build b/lib/meson.build index 2bc0932ad5..c7c0e42ae8 100644 --- a/lib/meson.build +++ b/lib/meson.build @@ -37,6 +37,7 @@ libraries = [ 'gpudev', 'gro', 'gso', + 'htimer', 'ip_frag', 'jobstats', 'kni', -- 2.34.1