DPDK patches and discussions
From: Thomas Monjalon <thomas@monjalon.net>
To: dev@dpdk.org
Cc: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>,
	rsanford@akamai.com, olivier.matz@6wind.com,
	stephen@networkplumber.org, bruce.richardson@intel.com
Subject: Re: [dpdk-dev] [PATCH 1/1] timer: fix race condition
Date: Wed, 19 Dec 2018 04:36:53 +0100
Message-ID: <3302607.WHYlCW4uPu@xps>
In-Reply-To: <1543517626-142526-1-git-send-email-erik.g.carrillo@intel.com>

Who could review this fix please?

29/11/2018 19:53, Erik Gabriel Carrillo:
> rte_timer_manage() adds expired timers to a "run list", and walks the
> list, transitioning each timer from the PENDING to the RUNNING state.
> If another lcore resets or stops the timer at precisely this
> moment, the timer state would instead be set to CONFIG by that other
> lcore, which would cause rte_timer_manage() to skip over it. This is
> expected behavior.
> 
> However, if a timer expires quickly enough, there exists the
> following race condition that causes the rte_timer_manage() routine to
> misinterpret a timer in CONFIG state, resulting in lost timers:
> 
> - Thread A:
>   - starts a timer with rte_timer_reset()
>   - the timer is moved to CONFIG state
>   - the spinlock associated with the appropriate skiplist is acquired
>   - timer is inserted into the skiplist
>   - the spinlock is released
> - Thread B:
>   - executes rte_timer_manage()
>   - finds the above timer expired and adds it to the run list
>   - walks the run list, sees the above timer still in CONFIG state,
>     unlinks it from the run list, and continues on
> - Thread A:
>   - moves the timer to PENDING state
>   - returns from rte_timer_reset()
>   - timer is now in PENDING state, but not actually linked into skiplist
>     and will never get processed further by rte_timer_manage()
> 
> This commit fixes this race condition by only releasing the spinlock
> after the timer state has been transitioned from CONFIG to PENDING,
> which prevents rte_timer_manage() from seeing an incorrect state.
> 
> Fixes: 9b15ba895b9f ("timer: use a skip list")
> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> ---
>  lib/librte_timer/rte_timer.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> index 590488c..30c7b0a 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -241,24 +241,17 @@ timer_get_prev_entries_for_node(struct rte_timer *tim, unsigned tim_lcore,
>  	}
>  }
>  
> -/*
> - * add in list, lock if needed
> +/* call with lock held as necessary
> + * add in list
>   * timer must be in config state
>   * timer must not be in a list
>   */
>  static void
> -timer_add(struct rte_timer *tim, unsigned tim_lcore, int local_is_locked)
> +timer_add(struct rte_timer *tim, unsigned int tim_lcore)
>  {
> -	unsigned lcore_id = rte_lcore_id();
>  	unsigned lvl;
>  	struct rte_timer *prev[MAX_SKIPLIST_DEPTH+1];
>  
> -	/* if timer needs to be scheduled on another core, we need to
> -	 * lock the list; if it is on local core, we need to lock if
> -	 * we are not called from rte_timer_manage() */
> -	if (tim_lcore != lcore_id || !local_is_locked)
> -		rte_spinlock_lock(&priv_timer[tim_lcore].list_lock);
> -
>  	/* find where exactly this element goes in the list of elements
>  	 * for each depth. */
>  	timer_get_prev_entries(tim->expire, tim_lcore, prev);
> @@ -282,9 +275,6 @@ timer_add(struct rte_timer *tim, unsigned tim_lcore, int local_is_locked)
>  	 * NOTE: this is not atomic on 32-bit*/
>  	priv_timer[tim_lcore].pending_head.expire = priv_timer[tim_lcore].\
>  			pending_head.sl_next[0]->expire;
> -
> -	if (tim_lcore != lcore_id || !local_is_locked)
> -		rte_spinlock_unlock(&priv_timer[tim_lcore].list_lock);
>  }
>  
>  /*
> @@ -379,8 +369,15 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
>  	tim->f = fct;
>  	tim->arg = arg;
>  
> +	/* if timer needs to be scheduled on another core, we need to
> +	 * lock the destination list; if it is on local core, we need to lock if
> +	 * we are not called from rte_timer_manage()
> +	 */
> +	if (tim_lcore != lcore_id || !local_is_locked)
> +		rte_spinlock_lock(&priv_timer[tim_lcore].list_lock);
> +
>  	__TIMER_STAT_ADD(pending, 1);
> -	timer_add(tim, tim_lcore, local_is_locked);
> +	timer_add(tim, tim_lcore);
>  
>  	/* update state: as we are in CONFIG state, only us can modify
>  	 * the state so we don't need to use cmpset() here */
> @@ -389,6 +386,9 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
>  	status.owner = (int16_t)tim_lcore;
>  	tim->status.u32 = status.u32;
>  
> +	if (tim_lcore != lcore_id || !local_is_locked)
> +		rte_spinlock_unlock(&priv_timer[tim_lcore].list_lock);
> +
>  	return 0;
>  }
>  
> 
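
For reviewers who want to step through the interleaving described in the
commit message, here is a minimal single-threaded sketch in C. It is only a
model, not the librte_timer code: every name in it (sketch_timer,
sketch_manage, sketch_interleave, the SK_* states) is hypothetical, the
skiplist is reduced to one membership flag, and the spinlock is reduced to a
boolean that blocks the simulated "thread B" while it is held. It assumes the
worst-case schedule, where thread B tries to run the manage step right after
thread A has linked the timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; names follow the commit message. */
enum sketch_state { SK_STOP, SK_CONFIG, SK_PENDING, SK_RUNNING };

struct sketch_timer {
	enum sketch_state state;
	bool in_pending_list;	/* stands in for skiplist membership */
	int fired;		/* number of times the callback ran */
};

/* Model of the manage step: unlink an expired timer onto a run list,
 * then run it only if it is in PENDING state; otherwise drop it. */
static void
sketch_manage(struct sketch_timer *t)
{
	assert(t != NULL);
	if (!t->in_pending_list)
		return;			/* nothing expired to process */
	t->in_pending_list = false;	/* moved to the run list */
	if (t->state != SK_PENDING)
		return;			/* skipped and unlinked */
	t->state = SK_RUNNING;
	t->fired++;			/* the callback would run here */
	t->state = SK_STOP;
}

/* "Thread B" can only enter the manage step when the list lock is free. */
static void
sketch_try_manage(struct sketch_timer *t, bool list_locked)
{
	if (!list_locked)
		sketch_manage(t);
}

/* Model of the reset path under the worst-case schedule.
 * fixed == false: unlock before the CONFIG -> PENDING transition (old order).
 * fixed == true:  unlock after the transition (order used by the patch).
 * Returns how many times the timer callback fired. */
static int
sketch_interleave(bool fixed)
{
	struct sketch_timer t = { SK_STOP, false, 0 };
	bool list_locked;

	t.state = SK_CONFIG;		/* thread A: start reset */
	list_locked = true;		/* thread A: take the list lock */
	t.in_pending_list = true;	/* thread A: link the timer */
	if (!fixed)
		list_locked = false;	/* old order: unlock too early */

	sketch_try_manage(&t, list_locked);	/* thread B: races in here */

	t.state = SK_PENDING;		/* thread A: publish PENDING */
	list_locked = false;		/* fixed order: unlock only now */

	sketch_try_manage(&t, list_locked);	/* thread B: a later pass */
	return t.fired;
}
```

In this model, sketch_interleave(false) returns 0: the timer ends up in
PENDING state but unlinked, so no later pass ever runs it. With the unlock
moved after the state change, sketch_interleave(true) returns 1, because the
manage step cannot observe the timer until it is both linked and PENDING.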

Thread overview: 6+ messages
2018-11-29 18:53 Erik Gabriel Carrillo
2018-12-19  3:36 ` Thomas Monjalon [this message]
2018-12-19  7:57   ` Gavin Hu (Arm Technology China)
2018-12-19 16:11     ` Carrillo, Erik G
2018-12-19 16:09 ` [dpdk-dev] [PATCH v2] " Erik Gabriel Carrillo
2018-12-19 18:54   ` Thomas Monjalon
