From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129])
	by dpdk.org (Postfix) with ESMTP id C5A3B2B91
	for ; Fri, 8 Mar 2019 18:48:17 +0100 (CET)
Received: from Internal Mail-Server by MTLPINE1 (envelope-from yskoh@mellanox.com)
	with ESMTPS (AES256-SHA encrypted); 8 Mar 2019 19:48:15 +0200
Received: from scfae-sc-2.mti.labs.mlnx (scfae-sc-2.mti.labs.mlnx [10.101.0.96])
	by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id x28HloAU002625;
	Fri, 8 Mar 2019 19:48:13 +0200
From: Yongseok Koh
To: Erik Gabriel Carrillo
Cc: Gavin Hu, dpdk stable
Date: Fri, 8 Mar 2019 09:46:52 -0800
Message-Id: <20190308174749.30771-14-yskoh@mellanox.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20190308174749.30771-1-yskoh@mellanox.com>
References: <20190308174749.30771-1-yskoh@mellanox.com>
Subject: [dpdk-stable] patch 'timer: fix race condition' has been queued to LTS release 17.11.6
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 08 Mar 2019 17:48:18 -0000

Hi,

FYI, your patch has been queued to LTS release 17.11.6.

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections by 03/13/19. So please
shout if anyone has an objection.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. If the code is different (i.e. not only metadata
diffs), due for example to a change in context or macro names, please
double-check it.

Thanks.
Yongseok

---
>>From 692b6919438f84d1fb1f422cb56d7dc275601337 Mon Sep 17 00:00:00 2001
From: Erik Gabriel Carrillo
Date: Wed, 19 Dec 2018 10:09:34 -0600
Subject: [PATCH] timer: fix race condition

[ upstream commit 7079e29f7f28661b712620f46e6a8514eb0a708a ]

rte_timer_manage() adds expired timers to a "run list", and walks the
list, transitioning each timer from the PENDING to the RUNNING state.
If another lcore resets or stops the timer at precisely this
moment, the timer state would instead be set to CONFIG by that other
lcore, which would cause timer_manage() to skip over it. This is
expected behavior.

However, if a timer expires quickly enough, there exists the
following race condition that causes the timer_manage() routine to
misinterpret a timer in CONFIG state, resulting in lost timers:

- Thread A:
  - starts a timer with rte_timer_reset()
  - the timer is moved to CONFIG state
  - the spinlock associated with the appropriate skiplist is acquired
  - timer is inserted into the skiplist
  - the spinlock is released
- Thread B:
  - executes rte_timer_manage()
  - find above timer as expired, add it to run list
  - walk run list, see above timer still in CONFIG state, unlink it from
    run list and continue on
- Thread A:
  - move timer to PENDING state
  - return from rte_timer_reset()
  - timer is now in PENDING state, but not actually linked into a
    pending list or a run list and will never get processed further
    by rte_timer_manage()

This commit fixes this race condition by only releasing the spinlock
after the timer state has been transitioned from CONFIG to PENDING,
which prevents rte_timer_manage() from seeing an incorrect state.

Fixes: 9b15ba895b9f ("timer: use a skip list")

Signed-off-by: Erik Gabriel Carrillo
Reviewed-by: Gavin Hu
---
 lib/librte_timer/rte_timer.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index ba436cd87..56f9d6275 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -270,24 +270,17 @@ timer_get_prev_entries_for_node(struct rte_timer *tim, unsigned tim_lcore,
 	}
 }
 
-/*
- * add in list, lock if needed
+/* call with lock held as necessary
+ * add in list
  * timer must be in config state
  * timer must not be in a list
  */
 static void
-timer_add(struct rte_timer *tim, unsigned tim_lcore, int local_is_locked)
+timer_add(struct rte_timer *tim, unsigned int tim_lcore)
 {
-	unsigned lcore_id = rte_lcore_id();
 	unsigned lvl;
 	struct rte_timer *prev[MAX_SKIPLIST_DEPTH+1];
 
-	/* if timer needs to be scheduled on another core, we need to
-	 * lock the list; if it is on local core, we need to lock if
-	 * we are not called from rte_timer_manage() */
-	if (tim_lcore != lcore_id || !local_is_locked)
-		rte_spinlock_lock(&priv_timer[tim_lcore].list_lock);
-
 	/* find where exactly this element goes in the list of elements
 	 * for each depth. */
 	timer_get_prev_entries(tim->expire, tim_lcore, prev);
@@ -311,9 +304,6 @@ timer_add(struct rte_timer *tim, unsigned tim_lcore, int local_is_locked)
 	 * NOTE: this is not atomic on 32-bit*/
 	priv_timer[tim_lcore].pending_head.expire = priv_timer[tim_lcore].\
 			pending_head.sl_next[0]->expire;
-
-	if (tim_lcore != lcore_id || !local_is_locked)
-		rte_spinlock_unlock(&priv_timer[tim_lcore].list_lock);
 }
 
 /*
@@ -408,8 +398,15 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 	tim->f = fct;
 	tim->arg = arg;
 
+	/* if timer needs to be scheduled on another core, we need to
+	 * lock the destination list; if it is on local core, we need to lock if
+	 * we are not called from rte_timer_manage()
+	 */
+	if (tim_lcore != lcore_id || !local_is_locked)
+		rte_spinlock_lock(&priv_timer[tim_lcore].list_lock);
+
 	__TIMER_STAT_ADD(pending, 1);
-	timer_add(tim, tim_lcore, local_is_locked);
+	timer_add(tim, tim_lcore);
 
 	/* update state: as we are in CONFIG state, only us can modify
 	 * the state so we don't need to use cmpset() here */
@@ -418,6 +415,9 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 	status.owner = (int16_t)tim_lcore;
 	tim->status.u32 = status.u32;
 
+	if (tim_lcore != lcore_id || !local_is_locked)
+		rte_spinlock_unlock(&priv_timer[tim_lcore].list_lock);
+
 	return 0;
 }
 
-- 
2.11.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2019-03-08 09:46:41.076676989 -0800
+++ 0014-timer-fix-race-condition.patch	2019-03-08 09:46:40.015398000 -0800
@@ -1,8 +1,10 @@
-From 7079e29f7f28661b712620f46e6a8514eb0a708a Mon Sep 17 00:00:00 2001
+From 692b6919438f84d1fb1f422cb56d7dc275601337 Mon Sep 17 00:00:00 2001
 From: Erik Gabriel Carrillo
 Date: Wed, 19 Dec 2018 10:09:34 -0600
 Subject: [PATCH] timer: fix race condition
 
+[ upstream commit 7079e29f7f28661b712620f46e6a8514eb0a708a ]
+
 rte_timer_manage() adds expired timers to a "run list", and walks the
 list, transitioning each timer from the PENDING to the RUNNING state.
 If another lcore resets or stops the timer at precisely this
@@ -37,7 +39,6 @@
 which prevents rte_timer_manage() from seeing an incorrect state.
 
 Fixes: 9b15ba895b9f ("timer: use a skip list")
-Cc: stable@dpdk.org
 
 Signed-off-by: Erik Gabriel Carrillo
 Reviewed-by: Gavin Hu
@@ -46,10 +47,10 @@
  1 file changed, 14 insertions(+), 14 deletions(-)
 
 diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
-index 590488c7e..30c7b0ab4 100644
+index ba436cd87..56f9d6275 100644
 --- a/lib/librte_timer/rte_timer.c
 +++ b/lib/librte_timer/rte_timer.c
-@@ -241,24 +241,17 @@ timer_get_prev_entries_for_node(struct rte_timer *tim, unsigned tim_lcore,
+@@ -270,24 +270,17 @@ timer_get_prev_entries_for_node(struct rte_timer *tim, unsigned tim_lcore,
  	}
  }
  
@@ -77,7 +78,7 @@
  	/* find where exactly this element goes in the list of elements
  	 * for each depth. */
  	timer_get_prev_entries(tim->expire, tim_lcore, prev);
-@@ -282,9 +275,6 @@ timer_add(struct rte_timer *tim, unsigned tim_lcore, int local_is_locked)
+@@ -311,9 +304,6 @@ timer_add(struct rte_timer *tim, unsigned tim_lcore, int local_is_locked)
  	 * NOTE: this is not atomic on 32-bit*/
  	priv_timer[tim_lcore].pending_head.expire = priv_timer[tim_lcore].\
  			pending_head.sl_next[0]->expire;
@@ -87,7 +88,7 @@
  }
  
  /*
-@@ -379,8 +369,15 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
+@@ -408,8 +398,15 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
  	tim->f = fct;
  	tim->arg = arg;
  
@@ -104,7 +105,7 @@
  
  	/* update state: as we are in CONFIG state, only us can modify
  	 * the state so we don't need to use cmpset() here */
-@@ -389,6 +386,9 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
+@@ -418,6 +415,9 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
  	status.owner = (int16_t)tim_lcore;
  	tim->status.u32 = status.u32;
  
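
For readers who want the ordering from the commit message in isolation, here
is a minimal C sketch of the fixed locking pattern: the timer is linked into
the list and moved from CONFIG to PENDING inside one critical section, so a
concurrent manage pass cannot observe a linked timer that is still in CONFIG.
This is not the DPDK code. Every toy_* name is hypothetical, a C11 atomic_flag
stands in for rte_spinlock, a singly linked list stands in for the skip list,
and the manage pass assumes every linked timer has already expired.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not the DPDK API. */
enum toy_state { TOY_STOP, TOY_CONFIG, TOY_PENDING, TOY_RUNNING };

struct toy_timer {
	_Atomic int state;
	struct toy_timer *next;
};

struct toy_list {
	atomic_flag lock;          /* stands in for priv_timer[].list_lock */
	struct toy_timer *pending; /* stands in for the pending skip list */
};

static void toy_lock(struct toy_list *list)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&list->lock,
						 memory_order_acquire))
		;
}

static void toy_unlock(struct toy_list *list)
{
	atomic_flag_clear_explicit(&list->lock, memory_order_release);
}

/* Call with the list lock held; timer must be in CONFIG and unlinked. */
static void toy_timer_add(struct toy_list *list, struct toy_timer *tim)
{
	assert(atomic_load(&tim->state) == TOY_CONFIG);
	tim->next = list->pending;
	list->pending = tim;
}

/* Link the timer and publish PENDING before the lock is released. */
static void toy_timer_reset(struct toy_list *list, struct toy_timer *tim)
{
	atomic_store(&tim->state, TOY_CONFIG);
	toy_lock(list);
	toy_timer_add(list, tim);
	atomic_store(&tim->state, TOY_PENDING);
	toy_unlock(list);
}

/* Detach all linked timers under the lock, then move the ones that are
 * still PENDING to RUNNING. Returns how many timers were moved. */
static int toy_timer_manage(struct toy_list *list)
{
	struct toy_timer *run_list, *tim;
	int ran = 0;

	toy_lock(list);
	run_list = list->pending;
	list->pending = NULL;
	toy_unlock(list);

	for (tim = run_list; tim != NULL; tim = tim->next) {
		int expected = TOY_PENDING;

		if (atomic_compare_exchange_strong(&tim->state, &expected,
						   TOY_RUNNING))
			ran++;
	}
	return ran;
}
```

With this ordering, calling toy_timer_reset() on a stopped timer leaves it
linked and PENDING, and a following toy_timer_manage() call moves it to
RUNNING. Moving toy_unlock() above the PENDING store would reopen the window
described in the commit message: the manage pass could detach the timer while
it is still in CONFIG and skip it, after which the timer becomes PENDING but
is no longer on any list.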