From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f172.google.com (mail-ig0-f172.google.com [209.85.213.172]) by dpdk.org (Postfix) with ESMTP id 855BF376D for ; Fri, 24 Jul 2015 00:42:44 +0200 (CEST) Received: by igvi1 with SMTP id i1so22241003igv.1 for ; Thu, 23 Jul 2015 15:42:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=zH5i6xFb/1PWFYe943/BSZ8CHPH0CALhDPQmjI6zd7Y=; b=eS5uqi0yInvUT7UDf9oMESRX7ahlhPeYbwU3qYY71C2RxkZNnJhrD2KZjxjMZk+gpy mu4hE9mn8e1OFvPqjuBFNQD+B/FANxzLRc5LX4e09oCJfiik8bXy5SuK0sopPdGtWgEr fkaoA3F6vInax6DIW03m/+Y9BTQqWMYOppM1R2zBJ4d7rbpA4jhyJaxIi0JhN0/mHNqF UOFfr/8r1TF2OPGBipQJVY0TuvHP8DMrFNgvC2xBrmLLOSwKLTxSpa7aOU6y19PHmgw+ nOQ65s1AbWL13T8hd/dPfnxl68gTgPaU0lq/pg9TfPSeUnJVEzUNyOzoAWJllS6uymSS 2MPQ== X-Received: by 10.50.33.19 with SMTP id n19mr548366igi.26.1437691363982; Thu, 23 Jul 2015 15:42:43 -0700 (PDT) Received: from localhost.localdomain ([23.79.237.14]) by smtp.gmail.com with ESMTPSA id j20sm254990igt.16.2015.07.23.15.42.43 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 23 Jul 2015 15:42:43 -0700 (PDT) From: rsanford2@gmail.com To: dev@dpdk.org Date: Thu, 23 Jul 2015 18:42:24 -0400 Message-Id: <1437691347-58708-1-git-send-email-rsanford2@gmail.com> X-Mailer: git-send-email 1.7.1 Subject: [dpdk-dev] [PATCH 0/3] timer: fix rte_timer_manage and improve unit tests X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Jul 2015 22:42:44 -0000 From: Robert Sanford This patchset fixes a bug in timer stress test 2, adds a new stress test to expose a race condition bug in API rte_timer_manage(), and then fixes the rte_timer_manage() bug. -- Patch 1, app/test timer stress test 2: Sometimes this test fails and seg-faults because the slave lcores get out of phase with the master. The master uses a single int, 'ready', to synchronize multiple slave lcores through multiple phases of the test. To resolve, we construct simple synchronization primitives that use one atomic-int state variable per slave. The master tells the slaves when to start, and then waits for all of them to finish. Each slave waits for the master to tell it to start, and then tells the master when it has finished. -- Description of rte_timer_manage() race condition bug: Through code inspection, we noticed a potential problem in rte_timer_manage() that leads to corruption of per-lcore pending-lists (implemented as skip-lists). The race condition occurs when rte_timer_manage() expires multiple timers on lcore A, while lcore B simultaneously invokes rte_timer_reset() for one of the expiring timers (other than the first one). Lcore A splits its pending-list, creating a local list of expired timers linked through their sl_next[0] pointers, and sets the first expired timer to the RUNNING state, all during one list-lock round trip. Lcore A then unlocks the list-lock to run the first callback, and that is when lcore B can misinterpret the subsequent expired timers' true states. Lcore B sees an expired timer still in the PENDING state, locks A's list-lock, and reinserts the timer into A's pending-list. We are trying to use the same pointer to belong to both lists! One solution is to remove timers from the pending-list and set them to the RUNNING state in one atomic step, i.e., we should perform these two actions within one ownership of the list-lock. -- Patch 2, new timer-manage race-condition test: We wrote a test to confirm our suspicion that we could crash rte_timer_manage() (RTM) under the right circumstances. We repeatedly set several timers to expire at roughly the same time on the master core. The master lcore just delays and runs RTM about ten times per second. The slave lcores all watch the first timer (timer-0) to see when RTM is running on the master, i.e., timer-0's state is not PENDING. At this point, each slave attempts to reset a subset of the timers to a later expiration time. The goal here is to have the slaves moving most of the timers to a different place in the master's pending-list, while the master is working with the same next-pointers and running callback functions. This eventually results in the master traversing a corrupted linked-list. In our observations, it results in an infinite loop. -- Patch 3, eliminate race condition in rte_timer_manage(): After splitting the pending-list at the current time point, we try to set all expired timers to the RUNNING state, and put back into the pending-list any timers that we fail to set to the RUNNING state, all during one list-lock round trip. It is then safe to run the callback functions for all expired timers that remain on our local list, without the lock. Robert Sanford (3): timer: fix stress test 2 synchronization bug timer: add timer-manage race condition test timer: fix race condition in rte_timer_manage() app/test/Makefile | 1 + app/test/test_timer.c | 149 ++++++++++++++++++++++------- app/test/test_timer_racecond.c | 209 ++++++++++++++++++++++++++++++++++++++++ lib/librte_timer/rte_timer.c | 45 ++++++--- 4 files changed, 355 insertions(+), 49 deletions(-) create mode 100644 app/test/test_timer_racecond.c