From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f173.google.com (mail-ig0-f173.google.com [209.85.213.173]) by dpdk.org (Postfix) with ESMTP id C46ADC5DC for ; Tue, 28 Jul 2015 00:46:25 +0200 (CEST) Received: by iggf3 with SMTP id f3so97554655igg.1 for ; Mon, 27 Jul 2015 15:46:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cQnNEA8igcM0csX2nVm6UE3A77yPI6d9rMB4NKbnq5k=; b=qgUjpHsZ4JZLNHTDORMDwT2/L/X9xk/jJcPRIwPI9+VAIGbrQ8uCnTIP6UoVTc8Wny ra9tCp46jfVy4ybBwOweITVDefE/2IKRwqQ7gHSZL1WyTTldT0doD8hownc84Z6La/Hu hOn8/5WCGjdzJvYtIqX/YN0ZSQBU9UCXa9dSnHBMXLOsTwcJdp+ww7C0ifdWRXDrz3vZ O7HWA6Tg04aLNoUU4M+rl4bbQ9ySdF/TmtjHVXY8tRMR+58IuIFtaJoCup1OeVHdRW7P oI5q8JK5whtVJ5SSeOb6ufXq9b5QrS9CR0z+/f8UWJ1E/djBrUnhvPuTlUBFItax7Mcy iXGQ== X-Received: by 10.107.40.147 with SMTP id o141mr47799987ioo.83.1438037185153; Mon, 27 Jul 2015 15:46:25 -0700 (PDT) Received: from localhost.localdomain ([23.79.237.14]) by smtp.gmail.com with ESMTPSA id j18sm7063061igf.2.2015.07.27.15.46.24 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 27 Jul 2015 15:46:24 -0700 (PDT) From: rsanford2@gmail.com To: dev@dpdk.org Date: Mon, 27 Jul 2015 18:46:05 -0400 Message-Id: <1438037168-639-3-git-send-email-rsanford2@gmail.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1437691347-58708-1-git-send-email-rsanford2@gmail.com> References: <1437691347-58708-1-git-send-email-rsanford2@gmail.com> Subject: [dpdk-dev] [PATCH v2 2/3] timer: add timer-manage race condition test X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jul 2015 22:46:26 -0000 From: Robert Sanford Add new timer-manage race-condition test: We wrote a test to confirm our suspicion that we could crash rte_timer_manage() under the right circumstances. We repeatedly set several timers to expire at roughly the same time on the master core. The master lcore just delays and runs rte_timer_manage() about ten times per second. The slave lcores all watch the first timer (timer-0) to see when rte_timer_manage() is running on the master, i.e., timer-0's state is not PENDING. At this point, each slave attempts to reset a subset of the timers to a later expiration time. The goal here is to have the slaves moving most of the timers to a different place in the master's pending-list, while the master is traversing the same next-pointers (the slaves' sl_next[0] pointers) and running callback functions. This eventually results in the master traversing a corrupted linked-list. In our observations, it results in an infinite loop. Signed-off-by: Robert Sanford --- app/test/Makefile | 1 + app/test/test_timer_racecond.c | 209 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 210 insertions(+), 0 deletions(-) create mode 100644 app/test/test_timer_racecond.c diff --git a/app/test/Makefile b/app/test/Makefile index caa359c..e7f148f 100644 --- a/app/test/Makefile +++ b/app/test/Makefile @@ -71,6 +71,7 @@ SRCS-y += test_rwlock.c SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer.c SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_perf.c +SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_racecond.c SRCS-y += test_mempool.c SRCS-y += test_mempool_perf.c diff --git a/app/test/test_timer_racecond.c b/app/test/test_timer_racecond.c new file mode 100644 index 0000000..32693d8 --- /dev/null +++ b/app/test/test_timer_racecond.c @@ -0,0 +1,209 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2015 Akamai Technologies. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "test.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#undef TEST_TIMER_RACECOND_VERBOSE + +#ifdef RTE_EXEC_ENV_LINUXAPP +#define usec_delay(us) usleep(us) +#else +#define usec_delay(us) rte_delay_us(us) +#endif + +#define BILLION (1UL << 30) + +#define TEST_DURATION_S 20 /* in seconds */ +#define N_TIMERS 50 + +static struct rte_timer timer[N_TIMERS]; +static unsigned timer_lcore_id[N_TIMERS]; + +static unsigned master; +static volatile unsigned stop_slaves; + +static int reload_timer(struct rte_timer *tim); + +static void +timer_cb(struct rte_timer *tim, void *arg __rte_unused) +{ + /* Simulate slow callback function, 100 us. */ + rte_delay_us(100); + +#ifdef TEST_TIMER_RACECOND_VERBOSE + if (tim == &timer[0]) + printf("------------------------------------------------\n"); + printf("timer_cb: core %u timer %lu\n", + rte_lcore_id(), tim - timer); +#endif + (void)reload_timer(tim); +} + +RTE_DEFINE_PER_LCORE(unsigned, n_reset_collisions); + +static int +reload_timer(struct rte_timer *tim) +{ + /* Make timer expire roughly when the TSC hits the next BILLION + * multiple. Add in timer's index to make them expire in nearly + * sorted order. This makes all timers somewhat synchronized, + * firing ~2-3 times per second, assuming 2-3 GHz TSCs. + */ + uint64_t ticks = BILLION - (rte_get_timer_cycles() % BILLION) + + (tim - timer); + int ret; + + ret = rte_timer_reset(tim, ticks, PERIODICAL, master, timer_cb, NULL); + if (ret != 0) { +#ifdef TEST_TIMER_RACECOND_VERBOSE + printf("- core %u failed to reset timer %lu (OK)\n", + rte_lcore_id(), tim - timer); +#endif + RTE_PER_LCORE(n_reset_collisions) += 1; + } + return ret; +} + +static int +slave_main_loop(__attribute__((unused)) void *arg) +{ + unsigned lcore_id = rte_lcore_id(); + unsigned i; + + RTE_PER_LCORE(n_reset_collisions) = 0; + + printf("Starting main loop on core %u\n", lcore_id); + + while (!stop_slaves) { + /* Wait until the timer manager is running. + * We know it's running when we see timer[0] NOT pending. + */ + if (rte_timer_pending(&timer[0])) { + rte_pause(); + continue; + } + + /* Now, go cause some havoc! + * Reload our timers. + */ + for (i = 0; i < N_TIMERS; i++) { + if (timer_lcore_id[i] == lcore_id) + (void)reload_timer(&timer[i]); + } + usec_delay(100*1000); /* sleep 100 ms */ + } + + if (RTE_PER_LCORE(n_reset_collisions) != 0) { + printf("- core %u, %u reset collisions (OK)\n", + lcore_id, RTE_PER_LCORE(n_reset_collisions)); + } + return 0; +} + +static int +test_timer_racecond(void) +{ + int ret; + uint64_t hz; + uint64_t cur_time; + uint64_t end_time; + int64_t diff = 0; + unsigned lcore_id; + unsigned i; + + master = lcore_id = rte_lcore_id(); + hz = rte_get_timer_hz(); + + /* init and start timers */ + for (i = 0; i < N_TIMERS; i++) { + rte_timer_init(&timer[i]); + ret = reload_timer(&timer[i]); + TEST_ASSERT(ret == 0, "reload_timer failed"); + + /* Distribute timers to slaves. + * Note that we assign timer[0] to the master. + */ + timer_lcore_id[i] = lcore_id; + lcore_id = rte_get_next_lcore(lcore_id, 1, 1); + } + + /* calculate the "end of test" time */ + cur_time = rte_get_timer_cycles(); + end_time = cur_time + (hz * TEST_DURATION_S); + + /* start slave cores */ + stop_slaves = 0; + printf("Start timer manage race condition test (%u seconds)\n", + TEST_DURATION_S); + rte_eal_mp_remote_launch(slave_main_loop, NULL, SKIP_MASTER); + + while (diff >= 0) { + /* run the timers */ + rte_timer_manage(); + + /* wait 100 ms */ + usec_delay(100*1000); + + cur_time = rte_get_timer_cycles(); + diff = end_time - cur_time; + } + + /* stop slave cores */ + printf("Stopping timer manage race condition test\n"); + stop_slaves = 1; + rte_eal_mp_wait_lcore(); + + /* stop timers */ + for (i = 0; i < N_TIMERS; i++) { + ret = rte_timer_stop(&timer[i]); + TEST_ASSERT(ret == 0, "rte_timer_stop failed"); + } + + return TEST_SUCCESS; +} + +static struct test_command timer_racecond_cmd = { + .command = "timer_racecond_autotest", + .callback = test_timer_racecond, +}; +REGISTER_TEST_COMMAND(timer_racecond_cmd); -- 1.7.1