DPDK patches and discussions
* [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
@ 2020-06-12 11:19 Phil Yang
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
                   ` (3 more replies)
  0 siblings, 4 replies; 45+ messages in thread
From: Phil Yang @ 2020-06-12 11:19 UTC (permalink / raw)
  To: dev, erik.g.carrillo
  Cc: drc, honnappa.nagarahalli, ruifeng.wang, dharmik.thakkar, nd, stable

The n_poll_lcores counter and the poll_lcores array are shared between
lcores, but updates to these variables are not protected by the
spinlock on each lcore's timer list. The read-modify-write operations
on the counter are not atomic, so there is a potential race condition
between lcores.

Use C11 atomics with RELAXED ordering to prevent the race condition.

Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
Cc: erik.g.carrillo@intel.com
Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 005459f..6a0e283 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
 	uint16_t nb_evs_invalid = 0;
 	uint64_t opaque;
 	int ret;
+	int n_lcores;
 
 	opaque = evtim->impl_opaque[1];
 	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
@@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v)))
-			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
+		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+						__ATOMIC_RELAXED);
+			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
+						__ATOMIC_RELAXED);
+		}
 	} else {
 		EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
 
@@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	uint32_t lcore_id = rte_lcore_id();
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
+	int n_lcores;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1033,8 +1039,10 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
-		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
-		++sw->n_poll_lcores;
+		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+						__ATOMIC_RELAXED);
+		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
+						__ATOMIC_RELAXED);
 	}
 
 	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
-- 
2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
  2020-06-12 11:19 [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Phil Yang
@ 2020-06-12 11:19 ` Phil Yang
  2020-06-23 21:01   ` Carrillo, Erik G
  2020-06-23 21:20   ` Stephen Hemminger
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics Phil Yang
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 45+ messages in thread
From: Phil Yang @ 2020-06-12 11:19 UTC (permalink / raw)
  To: dev, erik.g.carrillo
  Cc: drc, honnappa.nagarahalli, ruifeng.wang, dharmik.thakkar, nd

The in_use flag is a per-lcore variable that is not shared between
lcores in the normal case, and accesses to it are ordered on the same
core. However, when a non-EAL thread picks the highest lcore to insert
timers into, this flag can be contended between threads, so an atomic
CAS operation is needed.

Use a C11 atomic CAS instead of the generic rte_atomic operations to
avoid the unnecessary barrier on aarch64.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 lib/librte_eventdev/rte_event_timer_adapter.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 6a0e283..6947efb 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -554,7 +554,7 @@ struct swtim {
 	uint32_t timer_data_id;
 	/* Track which cores have actually armed a timer */
 	struct {
-		rte_atomic16_t v;
+		int16_t v;
 	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
 	/* Track which cores' timer lists should be polled */
 	unsigned int poll_lcores[RTE_MAX_LCORE];
@@ -606,7 +606,8 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+		if (unlikely(sw->in_use[lcore].v == 0)) {
+			sw->in_use[lcore].v = 1;
 			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
 						__ATOMIC_RELAXED);
 			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
@@ -834,7 +835,7 @@ swtim_init(struct rte_event_timer_adapter *adapter)
 
 	/* Initialize the variables that track in-use timer lists */
 	for (i = 0; i < RTE_MAX_LCORE; i++)
-		rte_atomic16_init(&sw->in_use[i].v);
+		sw->in_use[i].v = 0;
 
 	/* Initialize the timer subsystem and allocate timer data instance */
 	ret = rte_timer_subsystem_init();
@@ -1017,6 +1018,8 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
 	int n_lcores;
+	/* Timer is not armed state */
+	int16_t exp_state = 0;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1035,8 +1038,12 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	/* If this is the first time we're arming an event timer on this lcore,
 	 * mark this lcore as "in use"; this will cause the service
 	 * function to process the timer list that corresponds to this lcore.
+	 * The atomic CAS operation prevents a race condition on the in_use
+	 * flag between multiple non-EAL threads.
 	 */
-	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
+	if (unlikely(__atomic_compare_exchange_n(&sw->in_use[lcore_id].v,
+			&exp_state, 1, 0,
+			__ATOMIC_RELAXED, __ATOMIC_RELAXED))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
 		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
-- 
2.7.4



* [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics
  2020-06-12 11:19 [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Phil Yang
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
@ 2020-06-12 11:19 ` Phil Yang
  2020-06-22 10:12   ` Phil Yang
  2020-06-18 15:17 ` [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Carrillo, Erik G
  2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
  3 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-06-12 11:19 UTC (permalink / raw)
  To: dev, erik.g.carrillo
  Cc: drc, honnappa.nagarahalli, ruifeng.wang, dharmik.thakkar, nd

The implementation-specific opaque data is shared between the arm and
cancel operations. The state flag acts as a guard variable to make sure
updates to the opaque data are synchronized. This patch uses C11
atomics with explicit one-way memory ordering, instead of the full
barriers rte_smp_wmb()/rte_smp_rmb(), to synchronize the opaque data
between the timer arm and cancel threads.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
 lib/librte_eventdev/rte_event_timer_adapter.h |  2 +-
 2 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 6947efb..0a26501 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
 		sw->expired_timers[sw->n_expired_timers++] = tim;
 		sw->stats.evtim_exp_count++;
 
-		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+		__atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+				 __ATOMIC_RELEASE);
 	}
 
 	if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	int n_lcores;
 	/* Timer is not armed state */
 	int16_t exp_state = 0;
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	}
 
 	for (i = 0; i < nb_evtims; i++) {
-		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_RELAXED);
+		if (n_state == RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
-			     evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+			     n_state == RTE_EVENT_TIMER_CANCELED)) {
 			rte_errno = EINVAL;
 			break;
 		}
 
 		ret = check_timeout(evtims[i], adapter);
 		if (unlikely(ret == -1)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOLATE,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		} else if (unlikely(ret == -2)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOEARLY,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
 
 		if (unlikely(check_destination_event_queue(evtims[i],
 							   adapter) < 0)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 					  SINGLE, lcore_id, NULL, evtims[i]);
 		if (ret < 0) {
 			/* tim was in RUNNING or CONFIG state */
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELEASE);
 			break;
 		}
 
-		rte_smp_wmb();
 		EVTIM_LOG_DBG("armed an event timer");
-		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+		/* RELEASE ordering guarantees the adapter-specific value
+		 * changes are observed before the update of state.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *timp;
 	uint64_t opaque;
 	struct swtim *sw = swtim_pmd_priv(adapter);
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 	for (i = 0; i < nb_evtims; i++) {
 		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+		/* ACQUIRE ordering guarantees that the implementation-
+		 * specific opaque data is accessed under the correct state.
+		 */
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_CANCELED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EINVAL;
 			break;
 		}
 
-		rte_smp_rmb();
-
 		opaque = evtims[i]->impl_opaque[0];
 		timp = (struct rte_timer *)(uintptr_t)opaque;
 		RTE_ASSERT(timp != NULL);
@@ -1166,11 +1182,14 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
-		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+				__ATOMIC_RELAXED);
 		evtims[i]->impl_opaque[0] = 0;
 		evtims[i]->impl_opaque[1] = 0;
-
-		rte_smp_wmb();
+		/* The RELEASE fence makes sure the clean-up of the
+		 * opaque data is observed between threads.
+		 */
+		__atomic_thread_fence(__ATOMIC_RELEASE);
 	}
 
 	return i;
diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h b/lib/librte_eventdev/rte_event_timer_adapter.h
index d2ebcb0..6f64b90 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.h
+++ b/lib/librte_eventdev/rte_event_timer_adapter.h
@@ -467,7 +467,7 @@ struct rte_event_timer {
 	 *  - op: RTE_EVENT_OP_NEW
 	 *  - event_type: RTE_EVENT_TYPE_TIMER
 	 */
-	volatile enum rte_event_timer_state state;
+	enum rte_event_timer_state state;
 	/**< State of the event timer. */
 	uint64_t timeout_ticks;
 	/**< Expiry timer ticks expressed in number of *timer_ticks_ns* from
-- 
2.7.4



* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-06-12 11:19 [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Phil Yang
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics Phil Yang
@ 2020-06-18 15:17 ` Carrillo, Erik G
  2020-06-18 18:25   ` Honnappa Nagarahalli
  2020-06-22  9:09   ` Phil Yang
  2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
  3 siblings, 2 replies; 45+ messages in thread
From: Carrillo, Erik G @ 2020-06-18 15:17 UTC (permalink / raw)
  To: Phil Yang, dev
  Cc: drc, honnappa.nagarahalli, ruifeng.wang, dharmik.thakkar, nd, stable

Hi Phil,

Good catch - thanks for the fix.   I've commented in-line:

> -----Original Message-----
> From: Phil Yang <phil.yang@arm.com>
> Sent: Friday, June 12, 2020 6:20 AM
> To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> stable@dpdk.org
> Subject: [PATCH 1/3] eventdev: fix race condition on timer list counter
> 
> The n_poll_lcores counter and poll_lcore array are shared between lcores
> and the update of these variables are out of the protection of spinlock on
> each lcore timer list. The read-modify-write operations of the counter are
> not atomic, so it has the potential of race condition between lcores.
> 
> Use c11 atomics with RELAXED ordering to prevent confliction.
> 
> Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> Cc: erik.g.carrillo@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> b/lib/librte_eventdev/rte_event_timer_adapter.c
> index 005459f..6a0e283 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
>  	uint16_t nb_evs_invalid = 0;
>  	uint64_t opaque;
>  	int ret;
> +	int n_lcores;
> 
>  	opaque = evtim->impl_opaque[1];
>  	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
> @@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
>  				      "with immediate expiry value");
>  		}
> 
> -		if (unlikely(rte_atomic16_test_and_set(&sw-
> >in_use[lcore].v)))
> -			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> +		if (unlikely(rte_atomic16_test_and_set(&sw-
> >in_use[lcore].v))) {
> +			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> 1,
> +						__ATOMIC_RELAXED);

Just a nit, but let's align the continued line with the opening parentheses in
this location and below.  With these changes:

Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

> +			__atomic_store_n(&sw->poll_lcores[n_lcores],
> lcore,
> +						__ATOMIC_RELAXED);
> +		}
>  	} else {
>  		EVTIM_BUF_LOG_DBG("buffered an event timer expiry
> event");
> 
> @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> rte_event_timer_adapter *adapter,
>  	uint32_t lcore_id = rte_lcore_id();
>  	struct rte_timer *tim, *tims[nb_evtims];
>  	uint64_t cycles;
> +	int n_lcores;
> 
>  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
>  	/* Check that the service is running. */ @@ -1033,8 +1039,10 @@
> __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
>  	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
>  		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> poll",
>  			      lcore_id);
> -		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> -		++sw->n_poll_lcores;
> +		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> +						__ATOMIC_RELAXED);
> +		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
> +						__ATOMIC_RELAXED);
>  	}
> 
>  	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> --
> 2.7.4



* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-06-18 15:17 ` [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Carrillo, Erik G
@ 2020-06-18 18:25   ` Honnappa Nagarahalli
  2020-06-22  9:48     ` Phil Yang
  2020-07-02  3:26     ` Phil Yang
  2020-06-22  9:09   ` Phil Yang
  1 sibling, 2 replies; 45+ messages in thread
From: Honnappa Nagarahalli @ 2020-06-18 18:25 UTC (permalink / raw)
  To: Carrillo, Erik G, Phil Yang, dev
  Cc: drc, Ruifeng Wang, Dharmik Thakkar, nd, stable, Honnappa Nagarahalli, nd

<snip>

> 
> Hi Phil,
> 
> Good catch - thanks for the fix.   I've commented in-line:
> 
> > -----Original Message-----
> > From: Phil Yang <phil.yang@arm.com>
> > Sent: Friday, June 12, 2020 6:20 AM
> > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > stable@dpdk.org
> > Subject: [PATCH 1/3] eventdev: fix race condition on timer list
> > counter
> >
> > The n_poll_lcores counter and poll_lcore array are shared between
> > lcores and the update of these variables are out of the protection of
> > spinlock on each lcore timer list. The read-modify-write operations of
> > the counter are not atomic, so it has the potential of race condition
> between lcores.
> >
> > Use c11 atomics with RELAXED ordering to prevent confliction.
> >
> > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > Cc: erik.g.carrillo@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > index 005459f..6a0e283 100644
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> >  	uint16_t nb_evs_invalid = 0;
> >  	uint64_t opaque;
> >  	int ret;
> > +	int n_lcores;
> >
> >  	opaque = evtim->impl_opaque[1];
> >  	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque; @@
> > -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> >  				      "with immediate expiry value");
> >  		}
> >
> > -		if (unlikely(rte_atomic16_test_and_set(&sw-
> > >in_use[lcore].v)))
> > -			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > +		if (unlikely(rte_atomic16_test_and_set(&sw-
> > >in_use[lcore].v))) {
> > +			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > 1,
> > +						__ATOMIC_RELAXED);
Since this commit will be backported, we should prefer rte_atomic APIs for it; otherwise we will have a mix of rte_atomic and C11 APIs.
My suggestion is to fix this bug using rte_atomic so that the backported code contains only rte_atomic APIs, and to add another commit (if required) in this series that converts the fix to C11 APIs (that commit would not be backported).

> 
> Just a nit, but let's align the continued line with the opening parentheses in
> this location and below.  With these changes:
> 
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> > +			__atomic_store_n(&sw->poll_lcores[n_lcores],
> > lcore,
> > +						__ATOMIC_RELAXED);
> > +		}
> >  	} else {
> >  		EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
> >
> > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> > rte_event_timer_adapter *adapter,
> >  	uint32_t lcore_id = rte_lcore_id();
> >  	struct rte_timer *tim, *tims[nb_evtims];
> >  	uint64_t cycles;
> > +	int n_lcores;
> >
> >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> >  	/* Check that the service is running. */ @@ -1033,8 +1039,10 @@
> > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> >  	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> >  		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> poll",
> >  			      lcore_id);
> > -		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > -		++sw->n_poll_lcores;
> > +		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > +						__ATOMIC_RELAXED);
> > +		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
> > +						__ATOMIC_RELAXED);
> >  	}
> >
> >  	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > --
> > 2.7.4



* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-06-18 15:17 ` [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Carrillo, Erik G
  2020-06-18 18:25   ` Honnappa Nagarahalli
@ 2020-06-22  9:09   ` Phil Yang
  1 sibling, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-06-22  9:09 UTC (permalink / raw)
  To: Carrillo, Erik G, dev
  Cc: drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd, stable

> -----Original Message-----
> From: Carrillo, Erik G <erik.g.carrillo@intel.com>
> Sent: Thursday, June 18, 2020 11:18 PM
> To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar <Dharmik.Thakkar@arm.com>;
> nd <nd@arm.com>; stable@dpdk.org
> Subject: RE: [PATCH 1/3] eventdev: fix race condition on timer list counter
> 
> Hi Phil,
> 
> Good catch - thanks for the fix.   I've commented in-line:
> 
> > -----Original Message-----
> > From: Phil Yang <phil.yang@arm.com>
> > Sent: Friday, June 12, 2020 6:20 AM
> > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > stable@dpdk.org
> > Subject: [PATCH 1/3] eventdev: fix race condition on timer list counter
> >
> > The n_poll_lcores counter and poll_lcore array are shared between lcores
> > and the update of these variables are out of the protection of spinlock on
> > each lcore timer list. The read-modify-write operations of the counter are
> > not atomic, so it has the potential of race condition between lcores.
> >
> > Use c11 atomics with RELAXED ordering to prevent confliction.
> >
> > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > Cc: erik.g.carrillo@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > index 005459f..6a0e283 100644
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> >  	uint16_t nb_evs_invalid = 0;
> >  	uint64_t opaque;
> >  	int ret;
> > +	int n_lcores;
> >
> >  	opaque = evtim->impl_opaque[1];
> >  	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
> > @@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> >  				      "with immediate expiry value");
> >  		}
> >
> > -		if (unlikely(rte_atomic16_test_and_set(&sw-
> > >in_use[lcore].v)))
> > -			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > +		if (unlikely(rte_atomic16_test_and_set(&sw-
> > >in_use[lcore].v))) {
> > +			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > 1,
> > +						__ATOMIC_RELAXED);
> 
> Just a nit, but let's align the continued line with the opening parentheses in
> this location and below.  With these changes:

Thanks Erik. 
I will do it in the new version.

> 
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> > +			__atomic_store_n(&sw->poll_lcores[n_lcores],
> > lcore,
> > +						__ATOMIC_RELAXED);
> > +		}
> >  	} else {
> >  		EVTIM_BUF_LOG_DBG("buffered an event timer expiry
> > event");
> >
> > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> > rte_event_timer_adapter *adapter,
> >  	uint32_t lcore_id = rte_lcore_id();
> >  	struct rte_timer *tim, *tims[nb_evtims];
> >  	uint64_t cycles;
> > +	int n_lcores;
> >
> >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> >  	/* Check that the service is running. */ @@ -1033,8 +1039,10 @@
> > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> >  	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> >  		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> > poll",
> >  			      lcore_id);
> > -		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > -		++sw->n_poll_lcores;
> > +		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > +						__ATOMIC_RELAXED);
> > +		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
> > +						__ATOMIC_RELAXED);
> >  	}
> >
> >  	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > --
> > 2.7.4



* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-06-18 18:25   ` Honnappa Nagarahalli
@ 2020-06-22  9:48     ` Phil Yang
  2020-07-01 11:22       ` Jerin Jacob
  2020-07-02  3:26     ` Phil Yang
  1 sibling, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-06-22  9:48 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Carrillo, Erik G, dev
  Cc: drc, Ruifeng Wang, Dharmik Thakkar, nd, stable, nd

> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, June 19, 2020 2:26 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; Phil Yang
> <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> stable@dpdk.org; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> nd <nd@arm.com>
> Subject: RE: [PATCH 1/3] eventdev: fix race condition on timer list counter
> 
> <snip>
> 
> >
> > Hi Phil,
> >
> > Good catch - thanks for the fix.   I've commented in-line:
> >
> > > -----Original Message-----
> > > From: Phil Yang <phil.yang@arm.com>
> > > Sent: Friday, June 12, 2020 6:20 AM
> > > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > > stable@dpdk.org
> > > Subject: [PATCH 1/3] eventdev: fix race condition on timer list
> > > counter
> > >
> > > The n_poll_lcores counter and poll_lcore array are shared between
> > > lcores and the update of these variables are out of the protection of
> > > spinlock on each lcore timer list. The read-modify-write operations of
> > > the counter are not atomic, so it has the potential of race condition
> > between lcores.
> > >
> > > Use c11 atomics with RELAXED ordering to prevent confliction.
> > >
> > > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > > Cc: erik.g.carrillo@intel.com
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > ---
> > >  lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
> > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > index 005459f..6a0e283 100644
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> > >  uint16_t nb_evs_invalid = 0;
> > >  uint64_t opaque;
> > >  int ret;
> > > +int n_lcores;
> > >
> > >  opaque = evtim->impl_opaque[1];
> > >  adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque; @@
> > > -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> > >        "with immediate expiry value");
> > >  }
> > >
> > > -if (unlikely(rte_atomic16_test_and_set(&sw-
> > > >in_use[lcore].v)))
> > > -sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > > +if (unlikely(rte_atomic16_test_and_set(&sw-
> > > >in_use[lcore].v))) {
> > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > > 1,
> > > +__ATOMIC_RELAXED);
> Since this commit will be back ported, we should prefer to use rte_atomic
> APIs for this commit. Otherwise, we will have a mix of rte_atomic and C11
> APIs.
> My suggestion is to fix this bug using rte_atomic so that backported code will
> have only rte_atomic APIs. Add another commit (if required) in this series to
> make the bug fix use C11 APIs (this commit will not be backported).

Agree. 
I will change this patch to the rte_atomic version in the next version.

Thanks,
Phil

> 
> >
> > Just a nit, but let's align the continued line with the opening parentheses in
> > this location and below.  With these changes:
> >
> > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> >
> > > +__atomic_store_n(&sw->poll_lcores[n_lcores],
> > > lcore,
> > > +__ATOMIC_RELAXED);
> > > +}
> > >  } else {
> > >  EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
> > >
> > > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> > > rte_event_timer_adapter *adapter,
> > >  uint32_t lcore_id = rte_lcore_id();
> > >  struct rte_timer *tim, *tims[nb_evtims];
> > >  uint64_t cycles;
> > > +int n_lcores;
> > >
> > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > >  /* Check that the service is running. */
> > > @@ -1033,8 +1039,10 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > >  if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> > >  EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
> > >        lcore_id);
> > > -sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > > -++sw->n_poll_lcores;
> > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > > +__ATOMIC_RELAXED);
> > > +__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
> > > +__ATOMIC_RELAXED);
> > >  }
> > >
> > >  ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > > --
> > > 2.7.4
> 


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics Phil Yang
@ 2020-06-22 10:12   ` Phil Yang
  2020-06-23 19:38     ` Carrillo, Erik G
  0 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-06-22 10:12 UTC (permalink / raw)
  To: Phil Yang, dev, erik.g.carrillo
  Cc: drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Phil Yang
> Sent: Friday, June 12, 2020 7:20 PM
> To: dev@dpdk.org; erik.g.carrillo@intel.com
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar <Dharmik.Thakkar@arm.com>;
> nd <nd@arm.com>
> Subject: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> atomics
> 
> The implementation-specific opaque data is shared between arm and cancel
> operations. The state flag acts as a guard variable to make sure the
> update of opaque data is synchronized. This patch uses c11 atomics with
> explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> to synchronize the opaque data between timer arm and cancel threads.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  lib/librte_eventdev/rte_event_timer_adapter.c | 55
> ++++++++++++++++++---------
>  lib/librte_eventdev/rte_event_timer_adapter.h |  2 +-
>  2 files changed, 38 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> b/lib/librte_eventdev/rte_event_timer_adapter.c
> index 6947efb..0a26501 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> @@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
>  		sw->expired_timers[sw->n_expired_timers++] = tim;
>  		sw->stats.evtim_exp_count++;
> 
> -		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
> +		__atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
> +				 __ATOMIC_RELEASE);
>  	}
> 
>  	if (event_buffer_batch_ready(&sw->buffer)) {
> @@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct
> rte_event_timer_adapter *adapter,
>  	int n_lcores;
>  	/* Timer is not armed state */
>  	int16_t exp_state = 0;
> +	enum rte_event_timer_state n_state;
> 
>  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
>  	/* Check that the service is running. */
> @@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct
> rte_event_timer_adapter *adapter,
>  	}
> 
>  	for (i = 0; i < nb_evtims; i++) {
> -		/* Don't modify the event timer state in these cases */
> -		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
> +		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_RELAXED);
> +		if (n_state == RTE_EVENT_TIMER_ARMED) {
>  			rte_errno = EALREADY;
>  			break;
> -		} else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
> -			     evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
> +		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
> +			     n_state == RTE_EVENT_TIMER_CANCELED)) {
>  			rte_errno = EINVAL;
>  			break;
>  		}
> 
>  		ret = check_timeout(evtims[i], adapter);
>  		if (unlikely(ret == -1)) {
> -			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
> +			__atomic_store_n(&evtims[i]->state,
> +					RTE_EVENT_TIMER_ERROR_TOOLATE,
> +					__ATOMIC_RELAXED);
>  			rte_errno = EINVAL;
>  			break;
>  		} else if (unlikely(ret == -2)) {
> -			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
> +			__atomic_store_n(&evtims[i]->state,
> +					RTE_EVENT_TIMER_ERROR_TOOEARLY,
> +					__ATOMIC_RELAXED);
>  			rte_errno = EINVAL;
>  			break;
>  		}
> 
>  		if (unlikely(check_destination_event_queue(evtims[i],
>  							   adapter) < 0)) {
> -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> +			__atomic_store_n(&evtims[i]->state,
> +					RTE_EVENT_TIMER_ERROR,
> +					__ATOMIC_RELAXED);
>  			rte_errno = EINVAL;
>  			break;
>  		}
> @@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct
> rte_event_timer_adapter *adapter,
>  					  SINGLE, lcore_id, NULL, evtims[i]);
>  		if (ret < 0) {
>  			/* tim was in RUNNING or CONFIG state */
> -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> +			__atomic_store_n(&evtims[i]->state,
> +					RTE_EVENT_TIMER_ERROR,
> +					__ATOMIC_RELEASE);
>  			break;
>  		}
> 
> -		rte_smp_wmb();
>  		EVTIM_LOG_DBG("armed an event timer");
> -		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
> +		/* RELEASE ordering guarantees the adapter-specific value
> +		 * changes are observed before the update of state.
> +		 */
> +		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
> +				__ATOMIC_RELEASE);
>  	}
> 
>  	if (i < nb_evtims)
> @@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct
> rte_event_timer_adapter *adapter,
>  	struct rte_timer *timp;
>  	uint64_t opaque;
>  	struct swtim *sw = swtim_pmd_priv(adapter);
> +	enum rte_event_timer_state n_state;
> 
>  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
>  	/* Check that the service is running. */
> @@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct
> rte_event_timer_adapter *adapter,
> 
>  	for (i = 0; i < nb_evtims; i++) {
>  		/* Don't modify the event timer state in these cases */
> -		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
> +		/* ACQUIRE ordering guarantees the access of implementation-
> +		 * specific opaque data under the correct state.
> +		 */
> +		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
> +		if (n_state == RTE_EVENT_TIMER_CANCELED) {
>  			rte_errno = EALREADY;
>  			break;
> -		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
> +		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
>  			rte_errno = EINVAL;
>  			break;
>  		}
> 
> -		rte_smp_rmb();
> -
>  		opaque = evtims[i]->impl_opaque[0];
>  		timp = (struct rte_timer *)(uintptr_t)opaque;
>  		RTE_ASSERT(timp != NULL);
> @@ -1166,11 +1182,14 @@ swtim_cancel_burst(const struct
> rte_event_timer_adapter *adapter,
> 
>  		rte_mempool_put(sw->tim_pool, (void **)timp);
> 
> -		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> +		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
> +				__ATOMIC_RELAXED);
>  		evtims[i]->impl_opaque[0] = 0;
>  		evtims[i]->impl_opaque[1] = 0;

Is that safe to remove impl_opaque cleanup above?

Once the soft timer is canceled, __swtim_arm_burst cannot access these two fields while the state is RTE_EVENT_TIMER_CANCELED.
After a new timer is armed, the __swtim_arm_burst thread refills these two fields; it is the only producer of them.
I think the only risk is that the values of these two fields might be unknown after swtim_cancel_burst.
So it should be safe to remove them if no other thread accesses them after canceling the timer.

Any comments on this?
If we remove these two instructions, we can also remove the __atomic_thread_fence below and avoid its performance penalty.

Thanks,
Phil

> -
> -		rte_smp_wmb();
> +		/* The RELEASE fence makes sure the clean-up
> +		 * of opaque data is observed between threads.
> +		 */
> +		__atomic_thread_fence(__ATOMIC_RELEASE);
>  	}
> 
>  	return i;
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h
> b/lib/librte_eventdev/rte_event_timer_adapter.h
> index d2ebcb0..6f64b90 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> @@ -467,7 +467,7 @@ struct rte_event_timer {
>  	 *  - op: RTE_EVENT_OP_NEW
>  	 *  - event_type: RTE_EVENT_TYPE_TIMER
>  	 */
> -	volatile enum rte_event_timer_state state;
> +	enum rte_event_timer_state state;
>  	/**< State of the event timer. */
>  	uint64_t timeout_ticks;
>  	/**< Expiry timer ticks expressed in number of *timer_ticks_ns*
> from
> --
> 2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics
  2020-06-22 10:12   ` Phil Yang
@ 2020-06-23 19:38     ` Carrillo, Erik G
  2020-06-28 17:33       ` Phil Yang
  0 siblings, 1 reply; 45+ messages in thread
From: Carrillo, Erik G @ 2020-06-23 19:38 UTC (permalink / raw)
  To: Phil Yang, dev
  Cc: drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd

Hi Phil,

Comment in-line:

> -----Original Message-----
> From: Phil Yang <Phil.Yang@arm.com>
> Sent: Monday, June 22, 2020 5:12 AM
> To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org; Carrillo, Erik G
> <erik.g.carrillo@intel.com>
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> atomics
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Phil Yang
> > Sent: Friday, June 12, 2020 7:20 PM
> > To: dev@dpdk.org; erik.g.carrillo@intel.com
> > Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>;
> > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> > Subject: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> > atomics
> >
> > The implementation-specific opaque data is shared between arm and
> > cancel operations. The state flag acts as a guard variable to make
> > sure the update of opaque data is synchronized. This patch uses c11
> > atomics with explicit one way memory barrier instead of full barriers
> > rte_smp_w/rmb() to synchronize the opaque data between timer arm and
> cancel threads.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  lib/librte_eventdev/rte_event_timer_adapter.c | 55
> > ++++++++++++++++++---------
> >  lib/librte_eventdev/rte_event_timer_adapter.h |  2 +-
> >  2 files changed, 38 insertions(+), 19 deletions(-)
> >
> > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > index 6947efb..0a26501 100644
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > @@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
> >  		sw->expired_timers[sw->n_expired_timers++] = tim;
> >  		sw->stats.evtim_exp_count++;
> >
> > -		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
> > +		__atomic_store_n(&evtim->state,
> > RTE_EVENT_TIMER_NOT_ARMED,
> > +				 __ATOMIC_RELEASE);
> >  	}
> >
> >  	if (event_buffer_batch_ready(&sw->buffer)) {
> > @@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> >  	int n_lcores;
> >  	/* Timer is not armed state */
> >  	int16_t exp_state = 0;
> > +	enum rte_event_timer_state n_state;
> >
> >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> >  	/* Check that the service is running. */
> > @@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> >  	}
> >
> >  	for (i = 0; i < nb_evtims; i++) {
> > -		/* Don't modify the event timer state in these cases */
> > -		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
> > +		n_state = __atomic_load_n(&evtims[i]->state,
> > __ATOMIC_RELAXED);
> > +		if (n_state == RTE_EVENT_TIMER_ARMED) {
> >  			rte_errno = EALREADY;
> >  			break;
> > -		} else if (!(evtims[i]->state ==
> > RTE_EVENT_TIMER_NOT_ARMED ||
> > -			     evtims[i]->state ==
> > RTE_EVENT_TIMER_CANCELED)) {
> > +		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
> > +			     n_state == RTE_EVENT_TIMER_CANCELED)) {
> >  			rte_errno = EINVAL;
> >  			break;
> >  		}
> >
> >  		ret = check_timeout(evtims[i], adapter);
> >  		if (unlikely(ret == -1)) {
> > -			evtims[i]->state =
> > RTE_EVENT_TIMER_ERROR_TOOLATE;
> > +			__atomic_store_n(&evtims[i]->state,
> > +
> 	RTE_EVENT_TIMER_ERROR_TOOLATE,
> > +					__ATOMIC_RELAXED);
> >  			rte_errno = EINVAL;
> >  			break;
> >  		} else if (unlikely(ret == -2)) {
> > -			evtims[i]->state =
> > RTE_EVENT_TIMER_ERROR_TOOEARLY;
> > +			__atomic_store_n(&evtims[i]->state,
> > +
> > 	RTE_EVENT_TIMER_ERROR_TOOEARLY,
> > +					__ATOMIC_RELAXED);
> >  			rte_errno = EINVAL;
> >  			break;
> >  		}
> >
> >  		if (unlikely(check_destination_event_queue(evtims[i],
> >  							   adapter) < 0)) {
> > -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> > +			__atomic_store_n(&evtims[i]->state,
> > +					RTE_EVENT_TIMER_ERROR,
> > +					__ATOMIC_RELAXED);
> >  			rte_errno = EINVAL;
> >  			break;
> >  		}
> > @@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct
> > rte_event_timer_adapter *adapter,
> >  					  SINGLE, lcore_id, NULL, evtims[i]);
> >  		if (ret < 0) {
> >  			/* tim was in RUNNING or CONFIG state */
> > -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> > +			__atomic_store_n(&evtims[i]->state,
> > +					RTE_EVENT_TIMER_ERROR,
> > +					__ATOMIC_RELEASE);
> >  			break;
> >  		}
> >
> > -		rte_smp_wmb();
> >  		EVTIM_LOG_DBG("armed an event timer");
> > -		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
> > +		/* RELEASE ordering guarantees the adapter specific value
> > +		 * changes observed before the update of state.
> > +		 */
> > +		__atomic_store_n(&evtims[i]->state,
> > RTE_EVENT_TIMER_ARMED,
> > +				__ATOMIC_RELEASE);
> >  	}
> >
> >  	if (i < nb_evtims)
> > @@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct
> > rte_event_timer_adapter *adapter,
> >  	struct rte_timer *timp;
> >  	uint64_t opaque;
> >  	struct swtim *sw = swtim_pmd_priv(adapter);
> > +	enum rte_event_timer_state n_state;
> >
> >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> >  	/* Check that the service is running. */ @@ -1143,16 +1157,18 @@
> > swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
> >
> >  	for (i = 0; i < nb_evtims; i++) {
> >  		/* Don't modify the event timer state in these cases */
> > -		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
> > +		/* ACQUIRE ordering guarantees the access of implementation-
> > +		 * specific opaque data under the correct state.
> > +		 */
> > +		n_state = __atomic_load_n(&evtims[i]->state,
> > __ATOMIC_ACQUIRE);
> > +		if (n_state == RTE_EVENT_TIMER_CANCELED) {
> >  			rte_errno = EALREADY;
> >  			break;
> > -		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
> > +		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
> >  			rte_errno = EINVAL;
> >  			break;
> >  		}
> >
> > -		rte_smp_rmb();
> > -
> >  		opaque = evtims[i]->impl_opaque[0];
> >  		timp = (struct rte_timer *)(uintptr_t)opaque;
> >  		RTE_ASSERT(timp != NULL);
> > @@ -1166,11 +1182,14 @@ swtim_cancel_burst(const struct
> > rte_event_timer_adapter *adapter,
> >
> >  		rte_mempool_put(sw->tim_pool, (void **)timp);
> >
> > -		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> > +		__atomic_store_n(&evtims[i]->state,
> > RTE_EVENT_TIMER_CANCELED,
> > +				__ATOMIC_RELAXED);
> >  		evtims[i]->impl_opaque[0] = 0;
> >  		evtims[i]->impl_opaque[1] = 0;
> 
> Is that safe to remove impl_opaque cleanup above?
> 
> Once the soft timer is canceled, __swtim_arm_burst cannot access these
> two fields while the state is RTE_EVENT_TIMER_CANCELED.
> After a new timer is armed, the __swtim_arm_burst thread refills these
> two fields; it is the only producer of them.
> I think the only risk is that the values of these two fields might be unknown
> after swtim_cancel_burst.
> So it should be safe to remove them if no other thread access them after
> canceling the timer.
> 
> Any comments on this?
> If we remove these two instructions, we can also remove the
> __atomic_thread_fence below and avoid its performance penalty.
> 
> Thanks,
> Phil
> 

In this case, I see the fence as (more importantly) ensuring the state update is visible to other threads... do I misunderstand?  I suppose we could also update the state with an __atomic_store(..., __ATOMIC_RELEASE), but perhaps that's roughly equivalent?

> > -
> > -		rte_smp_wmb();
> > +		/* The RELEASE fence makes sure the clean-up
> > +		 * of opaque data is observed between threads.
> > +		 */
> > +		__atomic_thread_fence(__ATOMIC_RELEASE);
> >  	}
> >
> >  	return i;
> > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h
> > b/lib/librte_eventdev/rte_event_timer_adapter.h
> > index d2ebcb0..6f64b90 100644
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > @@ -467,7 +467,7 @@ struct rte_event_timer {
> >  	 *  - op: RTE_EVENT_OP_NEW
> >  	 *  - event_type: RTE_EVENT_TYPE_TIMER
> >  	 */
> > -	volatile enum rte_event_timer_state state;
> > +	enum rte_event_timer_state state;
> >  	/**< State of the event timer. */
> >  	uint64_t timeout_ticks;
> >  	/**< Expiry timer ticks expressed in number of *timer_ticks_ns*
> from
> > --
> > 2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
@ 2020-06-23 21:01   ` Carrillo, Erik G
  2020-06-28 16:12     ` Phil Yang
  2020-06-23 21:20   ` Stephen Hemminger
  1 sibling, 1 reply; 45+ messages in thread
From: Carrillo, Erik G @ 2020-06-23 21:01 UTC (permalink / raw)
  To: Phil Yang, dev
  Cc: drc, honnappa.nagarahalli, ruifeng.wang, dharmik.thakkar, nd

Hi Phil,

Comment in-line:

> -----Original Message-----
> From: Phil Yang <phil.yang@arm.com>
> Sent: Friday, June 12, 2020 6:20 AM
> To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com
> Subject: [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
> 
> The in_use flag is a per core variable which is not shared between lcores in
> the normal case and the access of this variable should be ordered on the
> same core. However, if non-EAL thread pick the highest lcore to insert timers
> into, there is the possibility of conflicts on this flag between threads. Then
> the atomic CAS operation is needed.
> 
> Use the c11 atomic CAS instead of the generic rte_atomic operations to avoid
> the unnecessary barrier on aarch64.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  lib/librte_eventdev/rte_event_timer_adapter.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> b/lib/librte_eventdev/rte_event_timer_adapter.c
> index 6a0e283..6947efb 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> @@ -554,7 +554,7 @@ struct swtim {
>  	uint32_t timer_data_id;
>  	/* Track which cores have actually armed a timer */
>  	struct {
> -		rte_atomic16_t v;
> +		int16_t v;
>  	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
>  	/* Track which cores' timer lists should be polled */
>  	unsigned int poll_lcores[RTE_MAX_LCORE];
> @@ -606,7 +606,8 @@ swtim_callback(struct rte_timer *tim)
>  				      "with immediate expiry value");
>  		}
> 
> -		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
> +		if (unlikely(sw->in_use[lcore].v == 0)) {
> +			sw->in_use[lcore].v = 1;
>  			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> 1,
>  						__ATOMIC_RELAXED);
>  			__atomic_store_n(&sw->poll_lcores[n_lcores],
> lcore, @@ -834,7 +835,7 @@ swtim_init(struct rte_event_timer_adapter
> *adapter)
> 
>  	/* Initialize the variables that track in-use timer lists */
>  	for (i = 0; i < RTE_MAX_LCORE; i++)
> -		rte_atomic16_init(&sw->in_use[i].v);
> +		sw->in_use[i].v = 0;
> 
>  	/* Initialize the timer subsystem and allocate timer data instance */
>  	ret = rte_timer_subsystem_init();
> @@ -1017,6 +1018,8 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
>  	struct rte_timer *tim, *tims[nb_evtims];
>  	uint64_t cycles;
>  	int n_lcores;
> +	/* Timer is not armed state */

A more accurate comment would be something like "Timer list for this lcore is not in use".

With that change, it looks good to me:
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

> +	int16_t exp_state = 0;
> 
>  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
>  	/* Check that the service is running. */
> @@ -1035,8 +1038,12 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
>  	/* If this is the first time we're arming an event timer on this lcore,
>  	 * mark this lcore as "in use"; this will cause the service
>  	 * function to process the timer list that corresponds to this lcore.
> +	 * The atomic CAS operation can prevent the race condition on in_use
> +	 * flag between multiple non-EAL threads.
>  	 */
> -	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> +	if (unlikely(__atomic_compare_exchange_n(&sw->in_use[lcore_id].v,
> +			&exp_state, 1, 0,
> +			__ATOMIC_RELAXED, __ATOMIC_RELAXED))) {
>  		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
>  			      lcore_id);
>  		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> --
> 2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
  2020-06-12 11:19 ` [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
  2020-06-23 21:01   ` Carrillo, Erik G
@ 2020-06-23 21:20   ` Stephen Hemminger
  2020-06-23 21:31     ` Carrillo, Erik G
  1 sibling, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2020-06-23 21:20 UTC (permalink / raw)
  To: Phil Yang
  Cc: dev, erik.g.carrillo, drc, honnappa.nagarahalli, ruifeng.wang,
	dharmik.thakkar, nd

On Fri, 12 Jun 2020 19:19:57 +0800
Phil Yang <phil.yang@arm.com> wrote:

>  	/* Track which cores have actually armed a timer */
>  	struct {
> -		rte_atomic16_t v;
> +		int16_t v;
>  	} __rte_cache_aligned in_use[RTE_MAX_LCORE];

Do you really need this to be cache aligned (ie one per line)?
Why have a signed value for a reference count? Shouldn't it be unsigned?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
  2020-06-23 21:20   ` Stephen Hemminger
@ 2020-06-23 21:31     ` Carrillo, Erik G
  2020-06-28 16:32       ` Phil Yang
  0 siblings, 1 reply; 45+ messages in thread
From: Carrillo, Erik G @ 2020-06-23 21:31 UTC (permalink / raw)
  To: Stephen Hemminger, Phil Yang
  Cc: dev, drc, honnappa.nagarahalli, ruifeng.wang, dharmik.thakkar, nd

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, June 23, 2020 4:20 PM
> To: Phil Yang <phil.yang@arm.com>
> Cc: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com
> Subject: Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore
> timer armed flag
> 
> On Fri, 12 Jun 2020 19:19:57 +0800
> Phil Yang <phil.yang@arm.com> wrote:
> 
> >  	/* Track which cores have actually armed a timer */
> >  	struct {
> > -		rte_atomic16_t v;
> > +		int16_t v;
> >  	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
> 
> Do you really need this to be cache aligned (ie one per line)?

I believe I did this originally to keep a cache line from bouncing when two different cores are arming timers, so it's not strictly necessary.

> Why have a signed value for a reference count? Shouldn't it be unsigned?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
  2020-06-23 21:01   ` Carrillo, Erik G
@ 2020-06-28 16:12     ` Phil Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-06-28 16:12 UTC (permalink / raw)
  To: Carrillo, Erik G, dev
  Cc: drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd

Hi Erik,

Sorry, I was on vacation.
Thanks for your feedback. I will update it in the next version.

Thanks,
Phil

> -----Original Message-----
> From: Carrillo, Erik G <erik.g.carrillo@intel.com>
> Sent: Wednesday, June 24, 2020 5:02 AM
> To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar <Dharmik.Thakkar@arm.com>;
> nd <nd@arm.com>
> Subject: RE: [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed
> flag
> 
> Hi Phil,
> 
> Comment in-line:
> 
> > -----Original Message-----
> > From: Phil Yang <phil.yang@arm.com>
> > Sent: Friday, June 12, 2020 6:20 AM
> > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com
> > Subject: [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
> >
> > The in_use flag is a per core variable which is not shared between lcores in
> > the normal case and the access of this variable should be ordered on the
> > same core. However, if non-EAL thread pick the highest lcore to insert
> timers
> > into, there is the possibility of conflicts on this flag between threads. Then
> > the atomic CAS operation is needed.
> >
> > Use the c11 atomic CAS instead of the generic rte_atomic operations to
> avoid
> > the unnecessary barrier on aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> >  lib/librte_eventdev/rte_event_timer_adapter.c | 15 +++++++++++----
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > index 6a0e283..6947efb 100644
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > @@ -554,7 +554,7 @@ struct swtim {
> >  	uint32_t timer_data_id;
> >  	/* Track which cores have actually armed a timer */
> >  	struct {
> > -		rte_atomic16_t v;
> > +		int16_t v;
> >  	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
> >  	/* Track which cores' timer lists should be polled */
> >  	unsigned int poll_lcores[RTE_MAX_LCORE]; @@ -606,7 +606,8 @@
> > swtim_callback(struct rte_timer *tim)
> >  				      "with immediate expiry value");
> >  		}
> >
> > -		if (unlikely(rte_atomic16_test_and_set(&sw-
> > >in_use[lcore].v))) {
> > +		if (unlikely(sw->in_use[lcore].v == 0)) {
> > +			sw->in_use[lcore].v = 1;
> >  			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > 1,
> >  						__ATOMIC_RELAXED);
> >  			__atomic_store_n(&sw->poll_lcores[n_lcores],
> > lcore, @@ -834,7 +835,7 @@ swtim_init(struct rte_event_timer_adapter
> > *adapter)
> >
> >  	/* Initialize the variables that track in-use timer lists */
> >  	for (i = 0; i < RTE_MAX_LCORE; i++)
> > -		rte_atomic16_init(&sw->in_use[i].v);
> > +		sw->in_use[i].v = 0;
> >
> >  	/* Initialize the timer subsystem and allocate timer data instance */
> >  	ret = rte_timer_subsystem_init();
> > @@ -1017,6 +1018,8 @@ __swtim_arm_burst(const struct
> > rte_event_timer_adapter *adapter,
> >  	struct rte_timer *tim, *tims[nb_evtims];
> >  	uint64_t cycles;
> >  	int n_lcores;
> > +	/* Timer is not armed state */
> 
> A more accurate comment would be something like "Timer list for this lcore is
> not in use".
> 
> With that change, it looks good to me:
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> > +	int16_t exp_state = 0;
> >
> >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> >  	/* Check that the service is running. */ @@ -1035,8 +1038,12 @@
> > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> >  	/* If this is the first time we're arming an event timer on this lcore,
> >  	 * mark this lcore as "in use"; this will cause the service
> >  	 * function to process the timer list that corresponds to this lcore.
> > +	 * The atomic CAS operation can prevent the race condition on
> > in_use
> > +	 * flag between multiple non-EAL threads.
> >  	 */
> > -	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> > +	if (unlikely(__atomic_compare_exchange_n(&sw-
> > >in_use[lcore_id].v,
> > +			&exp_state, 1, 0,
> > +			__ATOMIC_RELAXED, __ATOMIC_RELAXED))) {
> >  		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> > poll",
> >  			      lcore_id);
> >  		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > --
> > 2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag
  2020-06-23 21:31     ` Carrillo, Erik G
@ 2020-06-28 16:32       ` Phil Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-06-28 16:32 UTC (permalink / raw)
  To: Carrillo, Erik G, Stephen Hemminger
  Cc: dev, drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd

> -----Original Message-----
> From: Carrillo, Erik G <erik.g.carrillo@intel.com>
> Sent: Wednesday, June 24, 2020 5:32 AM
> To: Stephen Hemminger <stephen@networkplumber.org>; Phil Yang
> <Phil.Yang@arm.com>
> Cc: dev@dpdk.org; drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar <Dharmik.Thakkar@arm.com>;
> nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore
> timer armed flag
> 
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Tuesday, June 23, 2020 4:20 PM
> > To: Phil Yang <phil.yang@arm.com>
> > Cc: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> > drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com
> > Subject: Re: [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore
> > timer armed flag
> >
> > On Fri, 12 Jun 2020 19:19:57 +0800
> > Phil Yang <phil.yang@arm.com> wrote:
> >
> > >  	/* Track which cores have actually armed a timer */
> > >  	struct {
> > > -		rte_atomic16_t v;
> > > +		int16_t v;
> > >  	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
> >
> > Do you really need this to be cache aligned (ie one per line)?
> 
> I believe I did this originally to keep a cache line from bouncing when two
> different cores are arming timers, so it's not strictly necessary.

Yeah, if we remove it, these per core variables might cause a false sharing issue between threads. 
That will hurt performance.

> 
> > Why have a signed value for a reference count? Shouldn't it be unsigned?

Yes. It should be unsigned in the new code. 
I will update it in the next version.

Thanks,
Phil

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics
  2020-06-23 19:38     ` Carrillo, Erik G
@ 2020-06-28 17:33       ` Phil Yang
  2020-06-29 18:07         ` Carrillo, Erik G
  0 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-06-28 17:33 UTC (permalink / raw)
  To: Carrillo, Erik G, dev
  Cc: drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd

Hi Erik,

> -----Original Message-----
> From: Carrillo, Erik G <erik.g.carrillo@intel.com>
> Sent: Wednesday, June 24, 2020 3:39 AM
> To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar <Dharmik.Thakkar@arm.com>;
> nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> atomics
> 
> Hi Phil,
> 
> Comment in-line:
> 
> > -----Original Message-----
> > From: Phil Yang <Phil.Yang@arm.com>
> > Sent: Monday, June 22, 2020 5:12 AM
> > To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org; Carrillo, Erik G
> > <erik.g.carrillo@intel.com>
> > Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> > <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> > <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> > Subject: RE: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> > atomics
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Phil Yang
> > > Sent: Friday, June 12, 2020 7:20 PM
> > > To: dev@dpdk.org; erik.g.carrillo@intel.com
> > > Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> > <Ruifeng.Wang@arm.com>;
> > > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> > > Subject: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> > > atomics
> > >
> > > The implementation-specific opaque data is shared between arm and
> > > cancel operations. The state flag acts as a guard variable to make
> > > sure the update of opaque data is synchronized. This patch uses c11
> > > atomics with explicit one way memory barrier instead of full barriers
> > > rte_smp_w/rmb() to synchronize the opaque data between timer arm
> and
> > cancel threads.
> > >
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > ---
> > >  lib/librte_eventdev/rte_event_timer_adapter.c | 55
> > > ++++++++++++++++++---------
> > >  lib/librte_eventdev/rte_event_timer_adapter.h |  2 +-
> > >  2 files changed, 38 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > index 6947efb..0a26501 100644
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > @@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
> > >  		sw->expired_timers[sw->n_expired_timers++] = tim;
> > >  		sw->stats.evtim_exp_count++;
> > >
> > > -		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
> > > +		__atomic_store_n(&evtim->state,
> > > RTE_EVENT_TIMER_NOT_ARMED,
> > > +				 __ATOMIC_RELEASE);
> > >  	}
> > >
> > >  	if (event_buffer_batch_ready(&sw->buffer)) { @@ -1020,6 +1021,7
> > @@
> > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > >  	int n_lcores;
> > >  	/* Timer is not armed state */
> > >  	int16_t exp_state = 0;
> > > +	enum rte_event_timer_state n_state;
> > >
> > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > >  	/* Check that the service is running. */ @@ -1060,30 +1062,36 @@
> > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > >  	}
> > >
> > >  	for (i = 0; i < nb_evtims; i++) {
> > > -		/* Don't modify the event timer state in these cases */
> > > -		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
> > > +		n_state = __atomic_load_n(&evtims[i]->state,
> > > __ATOMIC_RELAXED);
> > > +		if (n_state == RTE_EVENT_TIMER_ARMED) {
> > >  			rte_errno = EALREADY;
> > >  			break;
> > > -		} else if (!(evtims[i]->state ==
> > > RTE_EVENT_TIMER_NOT_ARMED ||
> > > -			     evtims[i]->state ==
> > > RTE_EVENT_TIMER_CANCELED)) {
> > > +		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
> > > +			     n_state == RTE_EVENT_TIMER_CANCELED)) {
> > >  			rte_errno = EINVAL;
> > >  			break;
> > >  		}
> > >
> > >  		ret = check_timeout(evtims[i], adapter);
> > >  		if (unlikely(ret == -1)) {
> > > -			evtims[i]->state =
> > > RTE_EVENT_TIMER_ERROR_TOOLATE;
> > > +			__atomic_store_n(&evtims[i]->state,
> > > +
> > 	RTE_EVENT_TIMER_ERROR_TOOLATE,
> > > +					__ATOMIC_RELAXED);
> > >  			rte_errno = EINVAL;
> > >  			break;
> > >  		} else if (unlikely(ret == -2)) {
> > > -			evtims[i]->state =
> > > RTE_EVENT_TIMER_ERROR_TOOEARLY;
> > > +			__atomic_store_n(&evtims[i]->state,
> > > +
> > > 	RTE_EVENT_TIMER_ERROR_TOOEARLY,
> > > +					__ATOMIC_RELAXED);
> > >  			rte_errno = EINVAL;
> > >  			break;
> > >  		}
> > >
> > >  		if (unlikely(check_destination_event_queue(evtims[i],
> > >  							   adapter) < 0)) {
> > > -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> > > +			__atomic_store_n(&evtims[i]->state,
> > > +					RTE_EVENT_TIMER_ERROR,
> > > +					__ATOMIC_RELAXED);
> > >  			rte_errno = EINVAL;
> > >  			break;
> > >  		}
> > > @@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct
> > > rte_event_timer_adapter *adapter,
> > >  					  SINGLE, lcore_id, NULL, evtims[i]);
> > >  		if (ret < 0) {
> > >  			/* tim was in RUNNING or CONFIG state */
> > > -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> > > +			__atomic_store_n(&evtims[i]->state,
> > > +					RTE_EVENT_TIMER_ERROR,
> > > +					__ATOMIC_RELEASE);
> > >  			break;
> > >  		}
> > >
> > > -		rte_smp_wmb();
> > >  		EVTIM_LOG_DBG("armed an event timer");
> > > -		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
> > > +		/* RELEASE ordering guarantees the adapter specific value
> > > +		 * changes are observed before the update of state.
> > > +		 */
> > > +		__atomic_store_n(&evtims[i]->state,
> > > RTE_EVENT_TIMER_ARMED,
> > > +				__ATOMIC_RELEASE);
> > >  	}
> > >
> > >  	if (i < nb_evtims)
> > > @@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct
> > > rte_event_timer_adapter *adapter,
> > >  	struct rte_timer *timp;
> > >  	uint64_t opaque;
> > >  	struct swtim *sw = swtim_pmd_priv(adapter);
> > > +	enum rte_event_timer_state n_state;
> > >
> > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > >  	/* Check that the service is running. */ @@ -1143,16 +1157,18 @@
> > > swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
> > >
> > >  	for (i = 0; i < nb_evtims; i++) {
> > >  		/* Don't modify the event timer state in these cases */
> > > -		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
> > > +		/* ACQUIRE ordering guarantees the access of
> > > implementation
> > > +		 * specific opaque data under the correct state.
> > > +		 */
> > > +		n_state = __atomic_load_n(&evtims[i]->state,
> > > __ATOMIC_ACQUIRE);
> > > +		if (n_state == RTE_EVENT_TIMER_CANCELED) {
> > >  			rte_errno = EALREADY;
> > >  			break;
> > > -		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
> > > +		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
> > >  			rte_errno = EINVAL;
> > >  			break;
> > >  		}
> > >
> > > -		rte_smp_rmb();
> > > -
> > >  		opaque = evtims[i]->impl_opaque[0];
> > >  		timp = (struct rte_timer *)(uintptr_t)opaque;
> > >  		RTE_ASSERT(timp != NULL);
> > > @@ -1166,11 +1182,14 @@ swtim_cancel_burst(const struct
> > > rte_event_timer_adapter *adapter,
> > >
> > >  		rte_mempool_put(sw->tim_pool, (void **)timp);
> > >
> > > -		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> > > +		__atomic_store_n(&evtims[i]->state,
> > > RTE_EVENT_TIMER_CANCELED,
> > > +				__ATOMIC_RELAXED);
> > >  		evtims[i]->impl_opaque[0] = 0;
> > >  		evtims[i]->impl_opaque[1] = 0;
> >
> > Is it safe to remove the impl_opaque cleanup above?
> >
> > Once the soft timer is canceled, __swtim_arm_burst cannot access these
> > two fields under the RTE_EVENT_TIMER_CANCELED state.
> > After a new timer is armed, it refills these two fields in the
> > __swtim_arm_burst thread, which is the only producer of these two fields.
> > I think the only risk is that the values of these two fields might be
> > unknown after swtim_cancel_burst.
> > So it should be safe to remove them if no other thread accesses them after
> > canceling the timer.
> >
> > Any comments on this?
> > If we remove these two instructions, we can also remove the
> > __atomic_thread_fence below to save performance penalty.
> >
> > Thanks,
> > Phil
> >
> 
> In this case, I see the fence as (more importantly) ensuring the state update
> is visible to other threads... do I misunderstand?   I suppose we could also
> update the state with an __atomic_store(..., __ATOMIC_RELEASE), but
> perhaps that's roughly equivalent?

Yeah. In my understanding, the fence ensures that the updates to the state and the implementation-specific opaque data are visible to the other timer arm and cancel threads.
Actually, we only care about the state's value here.
An atomic RELEASE store also makes sure that all prior writes in the current thread are visible to other threads that ACQUIRE the same atomic variable.
So I think we can remove the fence, update the state with RELEASE, and load the state with ACQUIRE in the timer arm and cancel threads to achieve the same goal.
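
A minimal sketch of that release/acquire pairing, using the same GCC builtins as the patch. The state variable guards the opaque data; the names and simplified fields below are illustrative stand-ins, not the adapter's real ones:

```c
#include <stdint.h>

/* Illustrative guard-variable pattern: "state" protects "opaque".
 * The arm side writes opaque, then publishes ARMED with a RELEASE
 * store; the cancel side loads state with ACQUIRE, so once it observes
 * ARMED it is guaranteed to read the opaque value written beforehand. */
enum timer_state { NOT_ARMED, ARMED };

static enum timer_state state = NOT_ARMED;
static uint64_t opaque;

static void arm_timer(uint64_t impl_data)
{
	opaque = impl_data; /* plain store, ordered by the RELEASE below */
	__atomic_store_n(&state, ARMED, __ATOMIC_RELEASE);
}

static int cancel_timer(uint64_t *out)
{
	if (__atomic_load_n(&state, __ATOMIC_ACQUIRE) != ARMED)
		return -1; /* nothing to cancel */
	*out = opaque; /* safe: ACQUIRE synchronizes with the RELEASE */
	__atomic_store_n(&state, NOT_ARMED, __ATOMIC_RELEASE);
	return 0;
}
```

With this pairing in place, the standalone __atomic_thread_fence becomes redundant, which is the conclusion the thread reaches.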

> 
> > > -
> > > -		rte_smp_wmb();
> > > +		/* The RELEASE fence makes sure the clean-up
> > > +		 * of opaque data is observed by other threads.
> > > +		 */
> > > +		__atomic_thread_fence(__ATOMIC_RELEASE);
> > >  	}
> > >
> > >  	return i;
> > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > index d2ebcb0..6f64b90 100644
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > >  	 *  - op: RTE_EVENT_OP_NEW
> > >  	 *  - event_type: RTE_EVENT_TYPE_TIMER
> > >  	 */
> > > -	volatile enum rte_event_timer_state state;
> > > +	enum rte_event_timer_state state;
> > >  	/**< State of the event timer. */
> > >  	uint64_t timeout_ticks;
> > >  	/**< Expiry timer ticks expressed in number of *timer_ticks_ns*
> > from
> > > --
> > > 2.7.4



* Re: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics
  2020-06-28 17:33       ` Phil Yang
@ 2020-06-29 18:07         ` Carrillo, Erik G
  0 siblings, 0 replies; 45+ messages in thread
From: Carrillo, Erik G @ 2020-06-29 18:07 UTC (permalink / raw)
  To: Phil Yang, dev
  Cc: drc, Honnappa Nagarahalli, Ruifeng Wang, Dharmik Thakkar, nd

> -----Original Message-----
> From: Phil Yang <Phil.Yang@arm.com>
> Sent: Sunday, June 28, 2020 12:34 PM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11
> atomics
> 
> Hi Erik,
> 
> > -----Original Message-----
> > From: Carrillo, Erik G <erik.g.carrillo@intel.com>
> > Sent: Wednesday, June 24, 2020 3:39 AM
> > To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org
> > Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>;
> > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> > Subject: RE: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with
> > c11 atomics
> >
> > Hi Phil,
> >
> > Comment in-line:
> >
> > > -----Original Message-----
> > > From: Phil Yang <Phil.Yang@arm.com>
> > > Sent: Monday, June 22, 2020 5:12 AM
> > > To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org; Carrillo, Erik G
> > > <erik.g.carrillo@intel.com>
> > > Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>;
> > > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> > > Subject: RE: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers
> > > with c11 atomics
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Phil Yang
> > > > Sent: Friday, June 12, 2020 7:20 PM
> > > > To: dev@dpdk.org; erik.g.carrillo@intel.com
> > > > Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> > > > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> > > <Ruifeng.Wang@arm.com>;
> > > > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>
> > > > Subject: [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with
> > > > c11 atomics
> > > >
> > > > The implementation-specific opaque data is shared between arm and
> > > > cancel operations. The state flag acts as a guard variable to make
> > > > sure the update of opaque data is synchronized. This patch uses
> > > > c11 atomics with explicit one way memory barrier instead of full
> > > > barriers
> > > > rte_smp_w/rmb() to synchronize the opaque data between timer arm
> > and
> > > cancel threads.
> > > >
> > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > ---
> > > >  lib/librte_eventdev/rte_event_timer_adapter.c | 55
> > > > ++++++++++++++++++---------
> > > >  lib/librte_eventdev/rte_event_timer_adapter.h |  2 +-
> > > >  2 files changed, 38 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > index 6947efb..0a26501 100644
> > > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > @@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
> > > >  		sw->expired_timers[sw->n_expired_timers++] = tim;
> > > >  		sw->stats.evtim_exp_count++;
> > > >
> > > > -		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
> > > > +		__atomic_store_n(&evtim->state,
> > > > RTE_EVENT_TIMER_NOT_ARMED,
> > > > +				 __ATOMIC_RELEASE);
> > > >  	}
> > > >
> > > >  	if (event_buffer_batch_ready(&sw->buffer)) { @@ -1020,6 +1021,7
> > > @@
> > > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > > >  	int n_lcores;
> > > >  	/* Timer is not armed state */
> > > >  	int16_t exp_state = 0;
> > > > +	enum rte_event_timer_state n_state;
> > > >
> > > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > > >  	/* Check that the service is running. */ @@ -1060,30 +1062,36 @@
> > > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > > >  	}
> > > >
> > > >  	for (i = 0; i < nb_evtims; i++) {
> > > > -		/* Don't modify the event timer state in these cases */
> > > > -		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
> > > > +		n_state = __atomic_load_n(&evtims[i]->state,
> > > > __ATOMIC_RELAXED);
> > > > +		if (n_state == RTE_EVENT_TIMER_ARMED) {
> > > >  			rte_errno = EALREADY;
> > > >  			break;
> > > > -		} else if (!(evtims[i]->state ==
> > > > RTE_EVENT_TIMER_NOT_ARMED ||
> > > > -			     evtims[i]->state ==
> > > > RTE_EVENT_TIMER_CANCELED)) {
> > > > +		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
> > > > +			     n_state == RTE_EVENT_TIMER_CANCELED)) {
> > > >  			rte_errno = EINVAL;
> > > >  			break;
> > > >  		}
> > > >
> > > >  		ret = check_timeout(evtims[i], adapter);
> > > >  		if (unlikely(ret == -1)) {
> > > > -			evtims[i]->state =
> > > > RTE_EVENT_TIMER_ERROR_TOOLATE;
> > > > +			__atomic_store_n(&evtims[i]->state,
> > > > +
> > > 	RTE_EVENT_TIMER_ERROR_TOOLATE,
> > > > +					__ATOMIC_RELAXED);
> > > >  			rte_errno = EINVAL;
> > > >  			break;
> > > >  		} else if (unlikely(ret == -2)) {
> > > > -			evtims[i]->state =
> > > > RTE_EVENT_TIMER_ERROR_TOOEARLY;
> > > > +			__atomic_store_n(&evtims[i]->state,
> > > > +
> > > > 	RTE_EVENT_TIMER_ERROR_TOOEARLY,
> > > > +					__ATOMIC_RELAXED);
> > > >  			rte_errno = EINVAL;
> > > >  			break;
> > > >  		}
> > > >
> > > >  		if (unlikely(check_destination_event_queue(evtims[i],
> > > >  							   adapter) < 0)) {
> > > > -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> > > > +			__atomic_store_n(&evtims[i]->state,
> > > > +					RTE_EVENT_TIMER_ERROR,
> > > > +					__ATOMIC_RELAXED);
> > > >  			rte_errno = EINVAL;
> > > >  			break;
> > > >  		}
> > > > @@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct
> > > > rte_event_timer_adapter *adapter,
> > > >  					  SINGLE, lcore_id, NULL, evtims[i]);
> > > >  		if (ret < 0) {
> > > >  			/* tim was in RUNNING or CONFIG state */
> > > > -			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> > > > +			__atomic_store_n(&evtims[i]->state,
> > > > +					RTE_EVENT_TIMER_ERROR,
> > > > +					__ATOMIC_RELEASE);
> > > >  			break;
> > > >  		}
> > > >
> > > > -		rte_smp_wmb();
> > > >  		EVTIM_LOG_DBG("armed an event timer");
> > > > -		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
> > > > +		/* RELEASE ordering guarantees the adapter specific value
> > > > +		 * changes are observed before the update of state.
> > > > +		 */
> > > > +		__atomic_store_n(&evtims[i]->state,
> > > > RTE_EVENT_TIMER_ARMED,
> > > > +				__ATOMIC_RELEASE);
> > > >  	}
> > > >
> > > >  	if (i < nb_evtims)
> > > > @@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct
> > > > rte_event_timer_adapter *adapter,
> > > >  	struct rte_timer *timp;
> > > >  	uint64_t opaque;
> > > >  	struct swtim *sw = swtim_pmd_priv(adapter);
> > > > +	enum rte_event_timer_state n_state;
> > > >
> > > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > > >  	/* Check that the service is running. */ @@ -1143,16 +1157,18 @@
> > > > swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
> > > >
> > > >  	for (i = 0; i < nb_evtims; i++) {
> > > >  		/* Don't modify the event timer state in these cases */
> > > > -		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
> > > > +		/* ACQUIRE ordering guarantees the access of
> > > > implementation
> > > > +		 * specific opaque data under the correct state.
> > > > +		 */
> > > > +		n_state = __atomic_load_n(&evtims[i]->state,
> > > > __ATOMIC_ACQUIRE);
> > > > +		if (n_state == RTE_EVENT_TIMER_CANCELED) {
> > > >  			rte_errno = EALREADY;
> > > >  			break;
> > > > -		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
> > > > +		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
> > > >  			rte_errno = EINVAL;
> > > >  			break;
> > > >  		}
> > > >
> > > > -		rte_smp_rmb();
> > > > -
> > > >  		opaque = evtims[i]->impl_opaque[0];
> > > >  		timp = (struct rte_timer *)(uintptr_t)opaque;
> > > >  		RTE_ASSERT(timp != NULL);
> > > > @@ -1166,11 +1182,14 @@ swtim_cancel_burst(const struct
> > > > rte_event_timer_adapter *adapter,
> > > >
> > > >  		rte_mempool_put(sw->tim_pool, (void **)timp);
> > > >
> > > > -		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> > > > +		__atomic_store_n(&evtims[i]->state,
> > > > RTE_EVENT_TIMER_CANCELED,
> > > > +				__ATOMIC_RELAXED);
> > > >  		evtims[i]->impl_opaque[0] = 0;
> > > >  		evtims[i]->impl_opaque[1] = 0;
> > >
> > > Is it safe to remove the impl_opaque cleanup above?
> > >
> > > Once the soft timer is canceled, __swtim_arm_burst cannot access
> > > these two fields under the RTE_EVENT_TIMER_CANCELED state.
> > > After a new timer is armed, it refills these two fields in the
> > > __swtim_arm_burst thread, which is the only producer of these two
> > > fields.
> > > I think the only risk is that the values of these two fields might be
> > > unknown after swtim_cancel_burst.
> > > So it should be safe to remove them if no other thread accesses them
> > > after canceling the timer.
> > >
> > > Any comments on this?
> > > If we remove these two instructions, we can also remove the
> > > __atomic_thread_fence below to save performance penalty.
> > >
> > > Thanks,
> > > Phil
> > >
> >
> > In this case, I see the fence as (more importantly) ensuring the state
> > update is visible to other threads... do I misunderstand?  I suppose we
> > could also update the state with an __atomic_store(..., __ATOMIC_RELEASE),
> > but perhaps that's roughly equivalent?
> 
> Yeah. In my understanding, the fence ensures that the updates to the state
> and the implementation-specific opaque data are visible to the other timer
> arm and cancel threads.
> Actually, we only care about the state's value here.
> An atomic RELEASE store also makes sure that all prior writes in the current
> thread are visible to other threads that ACQUIRE the same atomic variable.
> So I think we can remove the fence, update the state with RELEASE, and load
> the state with ACQUIRE in the timer arm and cancel threads to achieve the
> same goal.

Ok, that sounds good to me.

Thanks,
Erik

> 
> >
> > > > -
> > > > -		rte_smp_wmb();
> > > > +		/* The RELEASE fence makes sure the clean-up
> > > > +		 * of opaque data is observed by other threads.
> > > > +		 */
> > > > +		__atomic_thread_fence(__ATOMIC_RELEASE);
> > > >  	}
> > > >
> > > >  	return i;
> > > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > > b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > > index d2ebcb0..6f64b90 100644
> > > > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > > >  	 *  - op: RTE_EVENT_OP_NEW
> > > >  	 *  - event_type: RTE_EVENT_TYPE_TIMER
> > > >  	 */
> > > > -	volatile enum rte_event_timer_state state;
> > > > +	enum rte_event_timer_state state;
> > > >  	/**< State of the event timer. */
> > > >  	uint64_t timeout_ticks;
> > > >  	/**< Expiry timer ticks expressed in number of *timer_ticks_ns*
> > > from
> > > > --
> > > > 2.7.4



* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-06-22  9:48     ` Phil Yang
@ 2020-07-01 11:22       ` Jerin Jacob
  2020-07-02  3:28         ` Phil Yang
  0 siblings, 1 reply; 45+ messages in thread
From: Jerin Jacob @ 2020-07-01 11:22 UTC (permalink / raw)
  To: Phil Yang
  Cc: Honnappa Nagarahalli, Carrillo, Erik G, dev, drc, Ruifeng Wang,
	Dharmik Thakkar, nd, stable

On Mon, Jun 22, 2020 at 3:18 PM Phil Yang <Phil.Yang@arm.com> wrote:
>
> > -----Original Message-----
> > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Sent: Friday, June 19, 2020 2:26 AM
> > To: Carrillo, Erik G <erik.g.carrillo@intel.com>; Phil Yang
> > <Phil.Yang@arm.com>; dev@dpdk.org
> > Cc: drc@linux.vnet.ibm.com; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> > stable@dpdk.org; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> > nd <nd@arm.com>
> > Subject: RE: [PATCH 1/3] eventdev: fix race condition on timer list counter
> >

> > Since this commit will be back ported, we should prefer to use rte_atomic
> > APIs for this commit. Otherwise, we will have a mix of rte_atomic and C11
> > APIs.
> > My suggestion is to fix this bug using rte_atomic so that backported code will
> > have only rte_atomic APIs. Add another commit (if required) in this series to
> > make the bug fix use C11 APIs (this commit will not be backported).
>
> Agree.
> I will change this patch to the rte_atomic version in the next version.

Hi Phil,

Could you send the next version? I would like to take this series for
RC1 (next-eventdev tree).


>
> Thanks,
> Phil
>


* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-06-18 18:25   ` Honnappa Nagarahalli
  2020-06-22  9:48     ` Phil Yang
@ 2020-07-02  3:26     ` Phil Yang
  2020-07-02  3:56       ` Honnappa Nagarahalli
  1 sibling, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-07-02  3:26 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Carrillo, Erik G, dev
  Cc: drc, Ruifeng Wang, Dharmik Thakkar, nd, stable, nd, jerinj

> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, June 19, 2020 2:26 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; Phil Yang
> <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> stable@dpdk.org; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> nd <nd@arm.com>
> Subject: RE: [PATCH 1/3] eventdev: fix race condition on timer list counter
> 
> <snip>
> 
> >
> > Hi Phil,
> >
> > Good catch - thanks for the fix.   I've commented in-line:
> >
> > > -----Original Message-----
> > > From: Phil Yang <phil.yang@arm.com>
> > > Sent: Friday, June 12, 2020 6:20 AM
> > > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > > stable@dpdk.org
> > > Subject: [PATCH 1/3] eventdev: fix race condition on timer list
> > > counter
> > >
> > > The n_poll_lcores counter and poll_lcore array are shared between
> > > lcores and the update of these variables are out of the protection of
> > > spinlock on each lcore timer list. The read-modify-write operations of
> > > the counter are not atomic, so it has the potential of race condition
> > between lcores.
> > >
> > > Use c11 atomics with RELAXED ordering to prevent confliction.
> > >
> > > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > > Cc: erik.g.carrillo@intel.com
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > ---
> > >  lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
> > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > index 005459f..6a0e283 100644
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> > >  uint16_t nb_evs_invalid = 0;
> > >  uint64_t opaque;
> > >  int ret;
> > > +int n_lcores;
> > >
> > >  opaque = evtim->impl_opaque[1];
> > >  adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque; @@
> > > -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> > >        "with immediate expiry value");
> > >  }
> > >
> > > -if (unlikely(rte_atomic16_test_and_set(&sw-
> > > >in_use[lcore].v)))
> > > -sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > > +if (unlikely(rte_atomic16_test_and_set(&sw-
> > > >in_use[lcore].v))) {
> > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > > 1,
> > > +__ATOMIC_RELAXED);
> Since this commit will be back ported, we should prefer to use rte_atomic
> APIs for this commit. Otherwise, we will have a mix of rte_atomic and C11
> APIs.
> My suggestion is to fix this bug using rte_atomic so that backported code will
> have only rte_atomic APIs. Add another commit (if required) in this series to
> make the bug fix use C11 APIs (this commit will not be backported).

Hi Honnappa,

There is no applicable rte_atomic_XXX API to fix this issue.
rte_atomic32_inc doesn't return the original value of its input, and rte_atomic32_add_return can only return the new value.

Meanwhile, the rte_timer_alt_manage and rte_timer_stop_all APIs do not support rte_atomic-typed parameters. We would need to rewrite these two APIs if we wanted to use rte_atomic operations for n_poll_lcores and the poll_lcores array.

So a better solution could be to backport the entire C11 solution to the stable releases.
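
The API gap described above can be sketched in plain C with the GCC builtin the patch uses. The counter and array here are simplified stand-ins for the adapter's n_poll_lcores / poll_lcores fields, not DPDK definitions:

```c
#include <stdint.h>

/* The slot index needed is the counter value *before* the increment,
 * which is exactly what __atomic_fetch_add returns; DPDK's
 * rte_atomic32_add_return only yields the value *after* the increment,
 * so it cannot replace this pattern directly. */
static uint32_t n_poll_lcores;
static unsigned int poll_lcores[128];

static void add_poll_lcore(unsigned int lcore)
{
	/* old value == first free slot; the add itself is atomic, so two
	 * lcores racing here always claim distinct slots */
	uint32_t slot = __atomic_fetch_add(&n_poll_lcores, 1,
					   __ATOMIC_RELAXED);
	__atomic_store_n(&poll_lcores[slot], lcore, __ATOMIC_RELAXED);
}
```

RELAXED ordering is sufficient because only the atomicity of the slot reservation matters here, not ordering against other memory accesses.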

Thanks,
Phil

> 
> >
> > Just a nit, but let's align the continued line with the opening parentheses in
> > this location and below.  With these changes:
> >
> > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> >
> > > +__atomic_store_n(&sw->poll_lcores[n_lcores],
> > > lcore,
> > > +__ATOMIC_RELAXED);
> > > +}
> > >  } else {
> > >  EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
> > >
> > > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> > > rte_event_timer_adapter *adapter,
> > >  uint32_t lcore_id = rte_lcore_id();
> > >  struct rte_timer *tim, *tims[nb_evtims];
> > >  uint64_t cycles;
> > > +int n_lcores;
> > >
> > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > >  /* Check that the service is running. */ @@ -1033,8 +1039,10 @@
> > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > >  if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> > >  EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> > poll",
> > >        lcore_id);
> > > -sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > > -++sw->n_poll_lcores;
> > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > > +__ATOMIC_RELAXED);
> > > +__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
> > > +__ATOMIC_RELAXED);
> > >  }
> > >
> > >  ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > > --
> > > 2.7.4
> 



* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-07-01 11:22       ` Jerin Jacob
@ 2020-07-02  3:28         ` Phil Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-02  3:28 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Honnappa Nagarahalli, Carrillo, Erik G, dev, drc, Ruifeng Wang,
	Dharmik Thakkar, nd, stable

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Wednesday, July 1, 2020 7:22 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Carrillo, Erik G
> <erik.g.carrillo@intel.com>; dev@dpdk.org; drc@linux.vnet.ibm.com;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>; stable@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list
> counter
> 
> On Mon, Jun 22, 2020 at 3:18 PM Phil Yang <Phil.Yang@arm.com> wrote:
> >
> > > -----Original Message-----
> > > From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > Sent: Friday, June 19, 2020 2:26 AM
> > > To: Carrillo, Erik G <erik.g.carrillo@intel.com>; Phil Yang
> > > <Phil.Yang@arm.com>; dev@dpdk.org
> > > Cc: drc@linux.vnet.ibm.com; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> > > Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> > > stable@dpdk.org; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>;
> > > nd <nd@arm.com>
> > > Subject: RE: [PATCH 1/3] eventdev: fix race condition on timer list counter
> > >
> 
> > > Since this commit will be back ported, we should prefer to use rte_atomic
> > > APIs for this commit. Otherwise, we will have a mix of rte_atomic and C11
> > > APIs.
> > > My suggestion is to fix this bug using rte_atomic so that backported code
> will
> > > have only rte_atomic APIs. Add another commit (if required) in this series
> to
> > > make the bug fix use C11 APIs (this commit will not be backported).
> >
> > Agree.
> > I will change this patch to the rte_atomic version in the next version.
> 
> Hi Phil,
> 
> Could you send the next version? I would like to take this series for
> RC1(next-eventdev tree)

Thanks, Jerin.
I will upstream the new patch series soon.


> 
> 
> >
> > Thanks,
> > Phil
> >


* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-07-02  3:26     ` Phil Yang
@ 2020-07-02  3:56       ` Honnappa Nagarahalli
  2020-07-02 21:15         ` Carrillo, Erik G
  0 siblings, 1 reply; 45+ messages in thread
From: Honnappa Nagarahalli @ 2020-07-02  3:56 UTC (permalink / raw)
  To: Phil Yang, Carrillo, Erik G, dev
  Cc: drc, Ruifeng Wang, Dharmik Thakkar, stable, nd, jerinj,
	Honnappa Nagarahalli, nd

<snip>
> >
> > >
> > > Hi Phil,
> > >
> > > Good catch - thanks for the fix.   I've commented in-line:
> > >
> > > > -----Original Message-----
> > > > From: Phil Yang <phil.yang@arm.com>
> > > > Sent: Friday, June 12, 2020 6:20 AM
> > > > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > > > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > > > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > > > stable@dpdk.org
> > > > Subject: [PATCH 1/3] eventdev: fix race condition on timer list
> > > > counter
> > > >
> > > > The n_poll_lcores counter and poll_lcore array are shared between
> > > > lcores and the update of these variables are out of the protection
> > > > of spinlock on each lcore timer list. The read-modify-write
> > > > operations of the counter are not atomic, so it has the potential
> > > > of race condition
> > > between lcores.
> > > >
> > > > Use c11 atomics with RELAXED ordering to prevent confliction.
> > > >
> > > > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > > > Cc: erik.g.carrillo@intel.com
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > ---
> > > >  lib/librte_eventdev/rte_event_timer_adapter.c | 16
> > > > ++++++++++++----
> > > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > index 005459f..6a0e283 100644
> > > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> > > > uint16_t nb_evs_invalid = 0;  uint64_t opaque;  int ret;
> > > > +int n_lcores;
> > > >
> > > >  opaque = evtim->impl_opaque[1];
> > > >  adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque; @@
> > > > -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> > > >        "with immediate expiry value");  }
> > > >
> > > > -if (unlikely(rte_atomic16_test_and_set(&sw-
> > > > >in_use[lcore].v)))
> > > > -sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > > > +if (unlikely(rte_atomic16_test_and_set(&sw-
> > > > >in_use[lcore].v))) {
> > > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > > > 1,
> > > > +__ATOMIC_RELAXED);
> > Since this commit will be backported, we should prefer to use
> > rte_atomic APIs for this commit. Otherwise, we will have a mix of
> > rte_atomic and C11 APIs.
> > My suggestion is to fix this bug using rte_atomic so that backported
> > code will have only rte_atomic APIs. Add another commit (if required)
> > in this series to make the bug fix use C11 APIs (this commit will not
> > be backported).
> 
> Hi Honnappa,
> 
> There is no applicable rte_atomic_XXX API to fix this issue.
> The rte_atomic32_inc API doesn't return the original value of its input
> parameter, and rte_atomic32_add_return can only return the new value.
Ok, understood.

> 
> Meanwhile, the rte_timer_alt_manage and rte_timer_stop_all APIs do not
> support rte_atomic type parameters. We might need to rewrite these two
> APIs if we want to use rte_atomic operations for n_poll_lcores and the
> poll_lcores array.
> 
> So, a better solution could be to backport the entire C11 solution to stable
> releases.
I am ok with the approach.
Erik, are you ok with this?

> 
> Thanks,
> Phil
> 
> >
> > >
> > > Just a nit, but let's align the continued line with the opening
> > > parentheses in this location and below.  With these changes:
> > >
> > > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> > >
> > > > +__atomic_store_n(&sw->poll_lcores[n_lcores],
> > > > lcore,
> > > > +__ATOMIC_RELAXED);
> > > > +}
> > > >  } else {
> > > >  EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
> > > >
> > > > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> > > > rte_event_timer_adapter *adapter,  uint32_t lcore_id =
> > > > rte_lcore_id();  struct rte_timer *tim, *tims[nb_evtims];
> > > > uint64_t cycles;
> > > > +int n_lcores;
> > > >
> > > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > > >  /* Check that the service is running. */ @@ -1033,8 +1039,10 @@
> > > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > > > if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v)))
> > > > {  EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> > > poll",
> > > >        lcore_id);
> > > > -sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > > > -++sw->n_poll_lcores;
> > > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > > > +__ATOMIC_RELAXED); __atomic_store_n(&sw->poll_lcores[n_lcores],
> > > > +lcore_id, __ATOMIC_RELAXED);
> > > >  }
> > > >
> > > >  ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > > > --
> > > > 2.7.4
> >


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 1/4] eventdev: fix race condition on timer list counter
  2020-06-12 11:19 [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Phil Yang
                   ` (2 preceding siblings ...)
  2020-06-18 15:17 ` [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Carrillo, Erik G
@ 2020-07-02  5:26 ` Phil Yang
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
                     ` (3 more replies)
  3 siblings, 4 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-02  5:26 UTC (permalink / raw)
  To: erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar,
	nd, stable

The n_poll_lcores counter and poll_lcores array are shared between
lcores, but updates to these variables are not protected by the spinlock
on each lcore timer list. The read-modify-write operations on the
counter are not atomic, so there is a potential race condition between
lcores.

Use C11 atomics with RELAXED ordering to prevent the conflict.

Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
Cc: erik.g.carrillo@intel.com
Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v2:
Align the code. (Erik)

 lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 005459f..a36d3d2 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
 	uint16_t nb_evs_invalid = 0;
 	uint64_t opaque;
 	int ret;
+	int n_lcores;
 
 	opaque = evtim->impl_opaque[1];
 	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
@@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v)))
-			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
+		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+						     __ATOMIC_RELAXED);
+			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
+					__ATOMIC_RELAXED);
+		}
 	} else {
 		EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
 
@@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	uint32_t lcore_id = rte_lcore_id();
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
+	int n_lcores;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1033,8 +1039,10 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
-		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
-		++sw->n_poll_lcores;
+		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+					     __ATOMIC_RELAXED);
+		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
+				__ATOMIC_RELAXED);
 	}
 
 	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
-- 
2.7.4
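The effect of the fix above can be illustrated outside DPDK with a minimal standalone sketch (the names `register_lcore` and `run_demo`, and the fixed thread count, are illustrative, not part of the adapter). The key point is that `__atomic_fetch_add` returns the pre-increment value, so each racing thread receives a unique slot index and no registration is lost:

```c
#include <pthread.h>
#include <stdint.h>

#define N_THREADS 8

static unsigned int poll_lcores[N_THREADS];
static int n_poll_lcores;

static void *register_lcore(void *arg)
{
	unsigned int lcore_id = (unsigned int)(uintptr_t)arg;

	/* The returned old value is a unique index, so two threads can
	 * never write the same slot; RELAXED ordering is enough because
	 * no other data is published through this counter.
	 */
	int n = __atomic_fetch_add(&n_poll_lcores, 1, __ATOMIC_RELAXED);

	__atomic_store_n(&poll_lcores[n], lcore_id, __ATOMIC_RELAXED);
	return NULL;
}

int run_demo(void)
{
	pthread_t t[N_THREADS];
	int i;

	for (i = 0; i < N_THREADS; i++)
		pthread_create(&t[i], NULL, register_lcore,
			       (void *)(uintptr_t)i);
	for (i = 0; i < N_THREADS; i++)
		pthread_join(t[i], NULL);

	/* All updates survive: the counter equals the thread count. */
	return n_poll_lcores;
}
```

With the original non-atomic `sw->poll_lcores[sw->n_poll_lcores++]`, two lcores could read the same counter value and overwrite each other's slot; the fetch-add closes that window.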


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag
  2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
@ 2020-07-02  5:26   ` Phil Yang
  2020-07-02 20:21     ` Carrillo, Erik G
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 3/4] eventdev: remove redundant code Phil Yang
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-07-02  5:26 UTC (permalink / raw)
  To: erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar, nd

The in_use flag is a per-core variable which is not shared between
lcores in the normal case, and accesses to this variable should be
ordered on the same core. However, if a non-EAL thread picks the highest
lcore to insert timers into, there is a possibility of conflicts on this
flag between threads, so an atomic CAS operation is needed.

Use the C11 atomic CAS instead of the generic rte_atomic operations
to avoid the unnecessary barrier on aarch64.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v2:
1. Make the code comments more accurate. (Erik)
2. Define the in_use flag as an unsigned type. (Stephen)

 lib/librte_eventdev/rte_event_timer_adapter.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index a36d3d2..7cc334c 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -554,7 +554,7 @@ struct swtim {
 	uint32_t timer_data_id;
 	/* Track which cores have actually armed a timer */
 	struct {
-		rte_atomic16_t v;
+		uint16_t v;
 	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
 	/* Track which cores' timer lists should be polled */
 	unsigned int poll_lcores[RTE_MAX_LCORE];
@@ -606,7 +606,8 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+		if (unlikely(sw->in_use[lcore].v == 0)) {
+			sw->in_use[lcore].v = 1;
 			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
 						     __ATOMIC_RELAXED);
 			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
@@ -834,7 +835,7 @@ swtim_init(struct rte_event_timer_adapter *adapter)
 
 	/* Initialize the variables that track in-use timer lists */
 	for (i = 0; i < RTE_MAX_LCORE; i++)
-		rte_atomic16_init(&sw->in_use[i].v);
+		sw->in_use[i].v = 0;
 
 	/* Initialize the timer subsystem and allocate timer data instance */
 	ret = rte_timer_subsystem_init();
@@ -1017,6 +1018,8 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
 	int n_lcores;
+	/* Timer list for this lcore is not in use. */
+	uint16_t exp_state = 0;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1035,8 +1038,12 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	/* If this is the first time we're arming an event timer on this lcore,
 	 * mark this lcore as "in use"; this will cause the service
 	 * function to process the timer list that corresponds to this lcore.
+	 * The atomic CAS operation can prevent the race condition on in_use
+	 * flag between multiple non-EAL threads.
 	 */
-	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
+	if (unlikely(__atomic_compare_exchange_n(&sw->in_use[lcore_id].v,
+			&exp_state, 1, 0,
+			__ATOMIC_RELAXED, __ATOMIC_RELAXED))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
 		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
-- 
2.7.4
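The claim-a-flag pattern used in this patch can be sketched in isolation (this is not the DPDK code; `try_claim` and the bare global are made-up names for illustration). Only the first caller that flips the flag from 0 to 1 "wins"; a strong CAS keeps this correct even when several non-EAL threads race on the same lcore id:

```c
#include <stdint.h>

static uint16_t in_use;

int try_claim(void)
{
	uint16_t exp_state = 0; /* expected: flag not yet set */

	/* Strong CAS (weak flag = 0): returns true only for the one
	 * caller that transitions the flag from 0 to 1. RELAXED ordering
	 * suffices here because the flag itself is the only datum.
	 */
	return __atomic_compare_exchange_n(&in_use, &exp_state, 1,
					   0 /* strong */,
					   __ATOMIC_RELAXED,
					   __ATOMIC_RELAXED);
}
```

On failure, `exp_state` is updated to the current value of the flag, which is why the adapter reinitializes the expected value before each arm attempt.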


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 3/4] eventdev: remove redundant code
  2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
@ 2020-07-02  5:26   ` Phil Yang
  2020-07-03  3:35     ` Dharmik Thakkar
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics Phil Yang
  2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
  3 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-07-02  5:26 UTC (permalink / raw)
  To: erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar, nd

No thread will access the impl_opaque data after a timer is canceled,
and the data is refilled when a new timer is armed. So the cleanup
process is unnecessary.

Signed-off-by: Phil Yang <phil.yang@arm.com>
---
 lib/librte_eventdev/rte_event_timer_adapter.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 7cc334c..8909a8c 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -1167,8 +1167,6 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
 		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-		evtims[i]->impl_opaque[0] = 0;
-		evtims[i]->impl_opaque[1] = 0;
 
 		rte_smp_wmb();
 	}
-- 
2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
  2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 3/4] eventdev: remove redundant code Phil Yang
@ 2020-07-02  5:26   ` Phil Yang
  2020-07-02 20:30     ` Carrillo, Erik G
  2020-07-06 10:04     ` Thomas Monjalon
  2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
  3 siblings, 2 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-02  5:26 UTC (permalink / raw)
  To: erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar, nd

The implementation-specific opaque data is shared between the arm and
cancel operations. The state flag acts as a guard variable to make sure
updates to the opaque data are synchronized. This patch uses C11 atomics
with explicit one-way memory ordering, instead of the full barriers
rte_smp_wmb()/rte_smp_rmb(), to synchronize the opaque data between the
timer arm and cancel threads.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v2:
1. Removed implementation-specific opaque data cleanup code.
2. Replaced thread fence with atomic ACQUIRE/RELEASE ordering on state access.

 lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
 lib/librte_eventdev/rte_event_timer_adapter.h |  2 +-
 2 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 8909a8c..ca00258 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
 		sw->expired_timers[sw->n_expired_timers++] = tim;
 		sw->stats.evtim_exp_count++;
 
-		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+		__atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	int n_lcores;
 	/* Timer list for this lcore is not in use. */
 	uint16_t exp_state = 0;
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	}
 
 	for (i = 0; i < nb_evtims; i++) {
-		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
-			     evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+			     n_state == RTE_EVENT_TIMER_CANCELED)) {
 			rte_errno = EINVAL;
 			break;
 		}
 
 		ret = check_timeout(evtims[i], adapter);
 		if (unlikely(ret == -1)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOLATE,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		} else if (unlikely(ret == -2)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOEARLY,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
 
 		if (unlikely(check_destination_event_queue(evtims[i],
 							   adapter) < 0)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 					  SINGLE, lcore_id, NULL, evtims[i]);
 		if (ret < 0) {
 			/* tim was in RUNNING or CONFIG state */
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELEASE);
 			break;
 		}
 
-		rte_smp_wmb();
 		EVTIM_LOG_DBG("armed an event timer");
-		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+		/* RELEASE ordering guarantees the adapter specific value
+		 * changes observed before the update of state.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *timp;
 	uint64_t opaque;
 	struct swtim *sw = swtim_pmd_priv(adapter);
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 	for (i = 0; i < nb_evtims; i++) {
 		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+		/* ACQUIRE ordering guarantees the access of implementation
+		 * specific opaque data under the correct state.
+		 */
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_CANCELED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EINVAL;
 			break;
 		}
 
-		rte_smp_rmb();
-
 		opaque = evtims[i]->impl_opaque[0];
 		timp = (struct rte_timer *)(uintptr_t)opaque;
 		RTE_ASSERT(timp != NULL);
@@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
-		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-
-		rte_smp_wmb();
+		/* The RELEASE ordering here pairs with atomic ordering
+		 * to make sure the state update data observed between
+		 * threads.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+				__ATOMIC_RELEASE);
 	}
 
 	return i;
diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h b/lib/librte_eventdev/rte_event_timer_adapter.h
index d2ebcb0..6f64b90 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.h
+++ b/lib/librte_eventdev/rte_event_timer_adapter.h
@@ -467,7 +467,7 @@ struct rte_event_timer {
 	 *  - op: RTE_EVENT_OP_NEW
 	 *  - event_type: RTE_EVENT_TYPE_TIMER
 	 */
-	volatile enum rte_event_timer_state state;
+	enum rte_event_timer_state state;
 	/**< State of the event timer. */
 	uint64_t timeout_ticks;
 	/**< Expiry timer ticks expressed in number of *timer_ticks_ns* from
-- 
2.7.4
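The release/acquire pairing this patch relies on can be shown with a small standalone sketch (not the DPDK code; `arm`, `cancel_when_armed` and the constants are illustrative). The writer fills the opaque data with a plain store and then publishes it with a RELEASE store to `state`; a reader that observes ARMED via an ACQUIRE load is guaranteed to also see the opaque data written before it, which is exactly what the removed `rte_smp_wmb()`/`rte_smp_rmb()` pair used to provide:

```c
#include <pthread.h>
#include <stdint.h>

enum { NOT_ARMED, ARMED };

static int state = NOT_ARMED;
static uint64_t impl_opaque;

static void *arm(void *arg)
{
	(void)arg;
	impl_opaque = 0xdeadbeef;	/* plain store to the opaque data */
	/* RELEASE: publishes the opaque data together with the state. */
	__atomic_store_n(&state, ARMED, __ATOMIC_RELEASE);
	return NULL;
}

uint64_t cancel_when_armed(void)
{
	/* ACQUIRE: once ARMED is observed, the earlier write to
	 * impl_opaque is guaranteed to be visible to this thread.
	 */
	while (__atomic_load_n(&state, __ATOMIC_ACQUIRE) != ARMED)
		;
	return impl_opaque;
}

uint64_t run_demo(void)
{
	pthread_t t;
	uint64_t v;

	pthread_create(&t, NULL, arm, NULL);
	v = cancel_when_armed();
	pthread_join(t, NULL);
	return v;
}
```

Unlike the full SMP barriers, these one-way orderings compile to plain loads/stores on x86 and to cheaper `ldar`/`stlr` instructions on aarch64, which is the motivation stated in the commit message.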


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
@ 2020-07-02 20:21     ` Carrillo, Erik G
  0 siblings, 0 replies; 45+ messages in thread
From: Carrillo, Erik G @ 2020-07-02 20:21 UTC (permalink / raw)
  To: Phil Yang, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar, nd

> -----Original Message-----
> From: Phil Yang <phil.yang@arm.com>
> Sent: Thursday, July 2, 2020 12:27 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; dev@dpdk.org
> Cc: jerinj@marvell.com; Honnappa.Nagarahalli@arm.com;
> drc@linux.vnet.ibm.com; Ruifeng.Wang@arm.com;
> Dharmik.Thakkar@arm.com; nd@arm.com
> Subject: [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag
> 
> The in_use flag is a per core variable which is not shared between lcores in
> the normal case and the access of this variable should be ordered on the
> same core. However, if non-EAL thread pick the highest lcore to insert timers
> into, there is the possibility of conflicts on this flag between threads. Then
> the atomic CAS operation is needed.
> 
> Use the c11 atomic CAS instead of the generic rte_atomic operations to avoid
> the unnecessary barrier on aarch64.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics Phil Yang
@ 2020-07-02 20:30     ` Carrillo, Erik G
  2020-07-03 10:50       ` Jerin Jacob
  2020-07-06 10:04     ` Thomas Monjalon
  1 sibling, 1 reply; 45+ messages in thread
From: Carrillo, Erik G @ 2020-07-02 20:30 UTC (permalink / raw)
  To: Phil Yang, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar, nd

> -----Original Message-----
> From: Phil Yang <phil.yang@arm.com>
> Sent: Thursday, July 2, 2020 12:27 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; dev@dpdk.org
> Cc: jerinj@marvell.com; Honnappa.Nagarahalli@arm.com;
> drc@linux.vnet.ibm.com; Ruifeng.Wang@arm.com;
> Dharmik.Thakkar@arm.com; nd@arm.com
> Subject: [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
> 
> The implementation-specific opaque data is shared between arm and cancel
> operations. The state flag acts as a guard variable to make sure the update of
> opaque data is synchronized. This patch uses c11 atomics with explicit one
> way memory barrier instead of full barriers rte_smp_w/rmb() to synchronize
> the opaque data between timer arm and cancel threads.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-07-02  3:56       ` Honnappa Nagarahalli
@ 2020-07-02 21:15         ` Carrillo, Erik G
  2020-07-02 21:30           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 45+ messages in thread
From: Carrillo, Erik G @ 2020-07-02 21:15 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Phil Yang, dev
  Cc: drc, Ruifeng Wang, Dharmik Thakkar, stable, nd, jerinj, nd



> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Wednesday, July 1, 2020 10:56 PM
> To: Phil Yang <Phil.Yang@arm.com>; Carrillo, Erik G
> <erik.g.carrillo@intel.com>; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> Dharmik Thakkar <Dharmik.Thakkar@arm.com>; stable@dpdk.org; nd
> <nd@arm.com>; jerinj@marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH 1/3] eventdev: fix race condition on timer list counter
> 
> <snip>
> > >
> > > >
> > > > Hi Phil,
> > > >
> > > > Good catch - thanks for the fix.   I've commented in-line:
> > > >
> > > > > -----Original Message-----
> > > > > From: Phil Yang <phil.yang@arm.com>
> > > > > Sent: Friday, June 12, 2020 6:20 AM
> > > > > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > > > > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > > > > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > > > > stable@dpdk.org
> > > > > Subject: [PATCH 1/3] eventdev: fix race condition on timer list
> > > > > counter
> > > > >
> > > > > The n_poll_lcores counter and poll_lcore array are shared
> > > > > between lcores and the update of these variables are out of the
> > > > > protection of spinlock on each lcore timer list. The
> > > > > read-modify-write operations of the counter are not atomic, so
> > > > > it has the potential of race condition
> > > > between lcores.
> > > > >
> > > > > Use c11 atomics with RELAXED ordering to prevent confliction.
> > > > >
> > > > > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > > > > Cc: erik.g.carrillo@intel.com
> > > > > Cc: stable@dpdk.org
> > > > >
> > > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > > ---
> > > > >  lib/librte_eventdev/rte_event_timer_adapter.c | 16
> > > > > ++++++++++++----
> > > > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > index 005459f..6a0e283 100644
> > > > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> > > > > uint16_t nb_evs_invalid = 0;  uint64_t opaque;  int ret;
> > > > > +int n_lcores;
> > > > >
> > > > >  opaque = evtim->impl_opaque[1];  adapter = (struct
> > > > > rte_event_timer_adapter *)(uintptr_t)opaque; @@
> > > > > -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> > > > >        "with immediate expiry value");  }
> > > > >
> > > > > -if (unlikely(rte_atomic16_test_and_set(&sw-
> > > > > >in_use[lcore].v)))
> > > > > -sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > > > > +if (unlikely(rte_atomic16_test_and_set(&sw-
> > > > > >in_use[lcore].v))) {
> > > > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores,
> > > > > 1,
> > > > > +__ATOMIC_RELAXED);
> > > Since this commit will be backported, we should prefer to use
> > > rte_atomic APIs for this commit. Otherwise, we will have a mix of
> > > rte_atomic and C11 APIs.
> > > My suggestion is to fix this bug using rte_atomic so that backported
> > > code will have only rte_atomic APIs. Add another commit (if
> > > required) in this series to make the bug fix use C11 APIs (this
> > > commit will not be backported).
> >
> > Hi Honnappa,
> >
> > There is no applicable rte_atomic_XXX API to fix this issue.
> > The rte_atomic32_inc API doesn't return the original value of its
> > input parameter, and rte_atomic32_add_return can only return the new
> > value.
> Ok, understood.
> 
> >
> > Meanwhile, the rte_timer_alt_manage and rte_timer_stop_all APIs do not
> > support rte_atomic type parameters. We might need to rewrite these two
> > APIs if we want to use rte_atomic operations for n_poll_lcores and the
> > poll_lcores array.
> >
> > So, a better solution could be to backport the entire C11 solution to
> > stable releases.
> I am ok with the approach.
> Erik, are you ok with this?
> 

Yes, I'm OK with that as well.

Thanks,
Erik

> >
> > Thanks,
> > Phil
> >
> > >
> > > >
> > > > Just a nit, but let's align the continued line with the opening
> > > > parentheses in this location and below.  With these changes:
> > > >
> > > > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> > > >
> > > > > +__atomic_store_n(&sw->poll_lcores[n_lcores],
> > > > > lcore,
> > > > > +__ATOMIC_RELAXED);
> > > > > +}
> > > > >  } else {
> > > > >  EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
> > > > >
> > > > > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct
> > > > > rte_event_timer_adapter *adapter,  uint32_t lcore_id =
> > > > > rte_lcore_id();  struct rte_timer *tim, *tims[nb_evtims];
> > > > > uint64_t cycles;
> > > > > +int n_lcores;
> > > > >
> > > > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > > > >  /* Check that the service is running. */ @@ -1033,8 +1039,10 @@
> > > > > __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > > > > if
> > > > > (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v)))
> > > > > {  EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to
> > > > poll",
> > > > >        lcore_id);
> > > > > -sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > > > > -++sw->n_poll_lcores;
> > > > > +n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > > > > +__ATOMIC_RELAXED); __atomic_store_n(&sw-
> >poll_lcores[n_lcores],
> > > > > +lcore_id, __ATOMIC_RELAXED);
> > > > >  }
> > > > >
> > > > >  ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > > > > --
> > > > > 2.7.4
> > >


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter
  2020-07-02 21:15         ` Carrillo, Erik G
@ 2020-07-02 21:30           ` Honnappa Nagarahalli
  0 siblings, 0 replies; 45+ messages in thread
From: Honnappa Nagarahalli @ 2020-07-02 21:30 UTC (permalink / raw)
  To: Carrillo, Erik G, Phil Yang, dev
  Cc: drc, Ruifeng Wang, Dharmik Thakkar, stable, nd, jerinj,
	Honnappa Nagarahalli, nd

<snip>

> > > >
> > > > >
> > > > > Hi Phil,
> > > > >
> > > > > Good catch - thanks for the fix.   I've commented in-line:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Phil Yang <phil.yang@arm.com>
> > > > > > Sent: Friday, June 12, 2020 6:20 AM
> > > > > > To: dev@dpdk.org; Carrillo, Erik G <erik.g.carrillo@intel.com>
> > > > > > Cc: drc@linux.vnet.ibm.com; honnappa.nagarahalli@arm.com;
> > > > > > ruifeng.wang@arm.com; dharmik.thakkar@arm.com; nd@arm.com;
> > > > > > stable@dpdk.org
> > > > > > Subject: [PATCH 1/3] eventdev: fix race condition on timer
> > > > > > list counter
> > > > > >
> > > > > > The n_poll_lcores counter and poll_lcore array are shared
> > > > > > between lcores and the update of these variables are out of
> > > > > > the protection of spinlock on each lcore timer list. The
> > > > > > read-modify-write operations of the counter are not atomic, so
> > > > > > it has the potential of race condition
> > > > > between lcores.
> > > > > >
> > > > > > Use c11 atomics with RELAXED ordering to prevent confliction.
> > > > > >
> > > > > > Fixes: cc7b73ea9e3b ("eventdev: add new software timer
> > > > > > adapter")
> > > > > > Cc: erik.g.carrillo@intel.com
> > > > > > Cc: stable@dpdk.org
> > > > > >
> > > > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > > > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > > > > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > > > ---
> > > > > >  lib/librte_eventdev/rte_event_timer_adapter.c | 16
> > > > > > ++++++++++++----
> > > > > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > > b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > > index 005459f..6a0e283 100644
> > > > > > --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> > > > > > @@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
> > > > > >  	uint16_t nb_evs_invalid = 0;
> > > > > >  	uint64_t opaque;
> > > > > >  	int ret;
> > > > > > +	int n_lcores;
> > > > > >
> > > > > >  	opaque = evtim->impl_opaque[1];
> > > > > >  	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
> > > > > > @@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
> > > > > >  				      "with immediate expiry value");
> > > > > >  		}
> > > > > >
> > > > > > -		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v)))
> > > > > > -			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
> > > > > > +		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
> > > > > > +			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > > > > > +					__ATOMIC_RELAXED);
> > > > Since this commit will be back ported, we should prefer to use
> > > > rte_atomic APIs for this commit. Otherwise, we will have a mix of
> > > > rte_atomic and C11 APIs.
> > > > My suggestion is to fix this bug using rte_atomic so that
> > > > backported code will have only rte_atomic APIs. Add another commit
> > > > (if
> > > > required) in this series to make the bug fix use C11 APIs (this
> > > > commit will not be
> > > backported).
> > >
> > > Hi Honnappa,
> > >
> > > There is no applicable rte_atomic_XXX API to fix this issue:
> > > rte_atomic32_inc does not return the original value of the input
> > > parameter, and rte_atomic32_add_return can only return the new value.
> > Ok, understood.
> >
> > >
> > > Meanwhile, the rte_timer_alt_manage & rte_timer_stop_all APIs do not
> > > support rte_atomic type parameters. We might need to rewrite these
> > > two APIs if we want to use rte_atomic operations for n_poll_lcores
> > > and the poll_lcores array.
> > >
> > > So, a better solution could be to backport the entire c11 solution
> > > to stable releases.
> > I am ok with the approach.
> > Erik, are you ok with this?
> >
> 
> Yes, I'm OK with that as well.
Thanks Erik. I think we need to add stable@dpdk.org to Cc for all the commits in the series.

> 
> Thanks,
> Erik
> 
> > >
> > > Thanks,
> > > Phil
> > >
> > > >
> > > > >
> > > > > Just a nit, but let's align the continued line with the opening
> > > > > parentheses in this location and below.  With these changes:
> > > > >
> > > > > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> > > > >
> > > > > > +			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
> > > > > > +					__ATOMIC_RELAXED);
> > > > > > +		}
> > > > > >  	} else {
> > > > > >  		EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
> > > > > >
> > > > > > @@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > > > > >  	uint32_t lcore_id = rte_lcore_id();
> > > > > >  	struct rte_timer *tim, *tims[nb_evtims];
> > > > > >  	uint64_t cycles;
> > > > > > +	int n_lcores;
> > > > > >
> > > > > >  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> > > > > >  	/* Check that the service is running. */
> > > > > > @@ -1033,8 +1039,10 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> > > > > >  	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
> > > > > >  		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
> > > > > >  			      lcore_id);
> > > > > > -		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
> > > > > > -		++sw->n_poll_lcores;
> > > > > > +		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
> > > > > > +					__ATOMIC_RELAXED);
> > > > > > +		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
> > > > > > +				__ATOMIC_RELAXED);
> > > > > >  	}
> > > > > >
> > > > > >  	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
> > > > > > --
> > > > > > 2.7.4
> > > >


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/4] eventdev: remove redundant code
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 3/4] eventdev: remove redundant code Phil Yang
@ 2020-07-03  3:35     ` Dharmik Thakkar
  0 siblings, 0 replies; 45+ messages in thread
From: Dharmik Thakkar @ 2020-07-03  3:35 UTC (permalink / raw)
  To: Phil Yang
  Cc: erik.g.carrillo, dpdk-dev, jerinj, Honnappa Nagarahalli, drc,
	Ruifeng Wang, nd

Hi Phil,
Looks good.

> On Jul 2, 2020, at 12:26 AM, Phil Yang <Phil.Yang@arm.com> wrote:
> 
> No thread will access the impl_opaque data after the timer is
> canceled. When a new timer is armed, the data is refilled. So the
> cleanup process is unnecessary.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> ---
> lib/librte_eventdev/rte_event_timer_adapter.c | 2 --
> 1 file changed, 2 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
> index 7cc334c..8909a8c 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> @@ -1167,8 +1167,6 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
> 		rte_mempool_put(sw->tim_pool, (void **)timp);
> 
> 		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> -		evtims[i]->impl_opaque[0] = 0;
> -		evtims[i]->impl_opaque[1] = 0;
> 
> 		rte_smp_wmb();
> 	}
> -- 
> 2.7.4
> 

Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
  2020-07-02 20:30     ` Carrillo, Erik G
@ 2020-07-03 10:50       ` Jerin Jacob
  0 siblings, 0 replies; 45+ messages in thread
From: Jerin Jacob @ 2020-07-03 10:50 UTC (permalink / raw)
  To: Carrillo, Erik G
  Cc: Phil Yang, dev, jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang,
	Dharmik.Thakkar, nd

On Fri, Jul 3, 2020 at 2:00 AM Carrillo, Erik G
<erik.g.carrillo@intel.com> wrote:
>
> > -----Original Message-----
> > From: Phil Yang <phil.yang@arm.com>
> > Sent: Thursday, July 2, 2020 12:27 AM
> > To: Carrillo, Erik G <erik.g.carrillo@intel.com>; dev@dpdk.org
> > Cc: jerinj@marvell.com; Honnappa.Nagarahalli@arm.com;
> > drc@linux.vnet.ibm.com; Ruifeng.Wang@arm.com;
> > Dharmik.Thakkar@arm.com; nd@arm.com
> > Subject: [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
> >
> > The implementation-specific opaque data is shared between arm and cancel
> > operations. The state flag acts as a guard variable to make sure the update of
> > opaque data is synchronized. This patch uses c11 atomics with explicit one
> > way memory barrier instead of full barriers rte_smp_w/rmb() to synchronize
> > the opaque data between timer arm and cancel threads.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

Series applied to dpdk-next-eventdev/master. Thanks.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics Phil Yang
  2020-07-02 20:30     ` Carrillo, Erik G
@ 2020-07-06 10:04     ` Thomas Monjalon
  2020-07-06 15:32       ` Phil Yang
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Monjalon @ 2020-07-06 10:04 UTC (permalink / raw)
  To: Phil Yang
  Cc: erik.g.carrillo, dev, jerinj, Honnappa.Nagarahalli, drc,
	Ruifeng.Wang, Dharmik.Thakkar, nd, david.marchand, mdr,
	Neil Horman, Dodji Seketeli

02/07/2020 07:26, Phil Yang:
> The implementation-specific opaque data is shared between arm and cancel
> operations. The state flag acts as a guard variable to make sure the
> update of opaque data is synchronized. This patch uses c11 atomics with
> explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> to synchronize the opaque data between timer arm and cancel threads.

I think we should write C11 (uppercase).

Please, in your explanations, try to be more specific.
Naming fields may help to make things clear.

[...]
> --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> @@ -467,7 +467,7 @@ struct rte_event_timer {
>  	 *  - op: RTE_EVENT_OP_NEW
>  	 *  - event_type: RTE_EVENT_TYPE_TIMER
>  	 */
> -	volatile enum rte_event_timer_state state;
> +	enum rte_event_timer_state state;
>  	/**< State of the event timer. */

Why do you remove the volatile keyword?
It is not explained in the commit log.

This change is triggering a warning in the ABI check:
http://mails.dpdk.org/archives/test-report/2020-July/140440.html
Moving from volatile to non-volatile is probably not an issue.
I expect the code generated for the volatile case to work the same
in non-volatile case. Do you confirm?

In any case, we need an explanation and an ABI check exception.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
  2020-07-06 10:04     ` Thomas Monjalon
@ 2020-07-06 15:32       ` Phil Yang
  2020-07-06 15:40         ` Thomas Monjalon
  0 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-07-06 15:32 UTC (permalink / raw)
  To: thomas
  Cc: erik.g.carrillo, dev, jerinj, Honnappa Nagarahalli, drc,
	Ruifeng Wang, Dharmik Thakkar, nd, david.marchand, mdr,
	Neil Horman, Dodji Seketeli

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, July 6, 2020 6:04 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: erik.g.carrillo@intel.com; dev@dpdk.org; jerinj@marvell.com; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; drc@linux.vnet.ibm.com;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> david.marchand@redhat.com; mdr@ashroe.eu; Neil Horman
> <nhorman@tuxdriver.com>; Dodji Seketeli <dodji@redhat.com>
> Subject: Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11
> atomics
> 
> 02/07/2020 07:26, Phil Yang:
> > The implementation-specific opaque data is shared between arm and
> cancel
> > operations. The state flag acts as a guard variable to make sure the
> > update of opaque data is synchronized. This patch uses c11 atomics with
> > explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> > to synchronize the opaque data between timer arm and cancel threads.
> 
> I think we should write C11 (uppercase).
Agreed. 
I will change it in the next version.

> 
> Please, in your explanations, try to be more specific.
> Naming fields may help to make things clear.
OK. Thanks.

> 
> [...]
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > @@ -467,7 +467,7 @@ struct rte_event_timer {
> >  	 *  - op: RTE_EVENT_OP_NEW
> >  	 *  - event_type: RTE_EVENT_TYPE_TIMER
> >  	 */
> > -	volatile enum rte_event_timer_state state;
> > +	enum rte_event_timer_state state;
> >  	/**< State of the event timer. */
> 
> Why do you remove the volatile keyword?
> It is not explained in the commit log.
By using the C11 atomic operations, the compiler generates the same instructions for the non-volatile and volatile versions.
Please check the sample code here: https://gcc.godbolt.org/z/8x5rWs

> 
> This change is triggering a warning in the ABI check:
> http://mails.dpdk.org/archives/test-report/2020-July/140440.html
> Moving from volatile to non-volatile is probably not an issue.
> I expect the code generated for the volatile case to work the same
> in non-volatile case. Do you confirm?
They generate the same instructions, so either way will work.
Do I need to revert it to the volatile version?


Thanks,
Phil
> 
> In any case, we need an explanation and an ABI check exception.
> 


^ permalink raw reply	[flat|nested] 45+ messages in thread
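The codegen equivalence discussed above can be checked locally as well. Below is a minimal standalone sketch (hypothetical names, not DPDK code) of the access pattern after the volatile qualifier is dropped: with GCC-style __atomic built-ins, atomicity and ordering come from the built-in itself, not from volatile:

```c
#include <assert.h>

enum timer_state { NOT_ARMED = 0, ARMED = 1 };

/* Note: no volatile qualifier on the field. */
struct evtim {
	enum timer_state state;
};

/* The __atomic built-ins supply the atomicity and ordering, so a plain
 * (non-volatile) field compiles to the same load/store instructions as
 * the volatile version did. */
static enum timer_state load_state(struct evtim *t)
{
	return __atomic_load_n(&t->state, __ATOMIC_ACQUIRE);
}

static void store_state(struct evtim *t, enum timer_state s)
{
	__atomic_store_n(&t->state, s, __ATOMIC_RELEASE);
}
```

Comparing the disassembly of this sketch with and without volatile on the field (as in the godbolt link) shows identical code for the atomic accesses.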

* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
  2020-07-06 15:32       ` Phil Yang
@ 2020-07-06 15:40         ` Thomas Monjalon
  0 siblings, 0 replies; 45+ messages in thread
From: Thomas Monjalon @ 2020-07-06 15:40 UTC (permalink / raw)
  To: Phil Yang
  Cc: erik.g.carrillo, dev, jerinj, Honnappa Nagarahalli, drc,
	Ruifeng Wang, Dharmik Thakkar, nd, david.marchand, mdr,
	Neil Horman, Dodji Seketeli

06/07/2020 17:32, Phil Yang:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 02/07/2020 07:26, Phil Yang:
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > >  	 *  - op: RTE_EVENT_OP_NEW
> > >  	 *  - event_type: RTE_EVENT_TYPE_TIMER
> > >  	 */
> > > -	volatile enum rte_event_timer_state state;
> > > +	enum rte_event_timer_state state;
> > >  	/**< State of the event timer. */
> > 
> > Why do you remove the volatile keyword?
> > It is not explained in the commit log.
> By using the C11 atomic operations, it will generate the same instructions for non-volatile and volatile version.
> Please check the sample code here: https://gcc.godbolt.org/z/8x5rWs
> 
> > This change is triggering a warning in the ABI check:
> > http://mails.dpdk.org/archives/test-report/2020-July/140440.html
> > Moving from volatile to non-volatile is probably not an issue.
> > I expect the code generated for the volatile case to work the same
> > in non-volatile case. Do you confirm?
> They generate the same instructions, so either way will work.
> Do I need to revert it to the volatile version?

Either you revert, or you add explanation in the commit log
+ exception in libabigail.abignore



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter
  2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
                     ` (2 preceding siblings ...)
  2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics Phil Yang
@ 2020-07-07 11:13   ` Phil Yang
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
                       ` (3 more replies)
  3 siblings, 4 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 11:13 UTC (permalink / raw)
  To: thomas, erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar,
	nd, david.marchand, mdr, nhorman, dodji, stable

The n_poll_lcores counter and poll_lcores array are shared between lcores,
and the updates of these variables are outside the protection of the
spinlock on each lcore timer list. The read-modify-write operations on the
counter are not atomic, so there is a potential race condition between
lcores.

Use C11 atomics with RELAXED ordering to prevent conflicts.

Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
Cc: erik.g.carrillo@intel.com
Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v2:
Align the code. (Erik)

 lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 2321803..370ea40 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
 	uint16_t nb_evs_invalid = 0;
 	uint64_t opaque;
 	int ret;
+	int n_lcores;
 
 	opaque = evtim->impl_opaque[1];
 	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
@@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v)))
-			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
+		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+						     __ATOMIC_RELAXED);
+			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
+					__ATOMIC_RELAXED);
+		}
 	} else {
 		EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
 
@@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	uint32_t lcore_id = rte_lcore_id();
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
+	int n_lcores;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1033,8 +1039,10 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
-		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
-		++sw->n_poll_lcores;
+		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+					     __ATOMIC_RELAXED);
+		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
+				__ATOMIC_RELAXED);
 	}
 
 	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
-- 
2.7.4
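The fix works because __atomic_fetch_add returns the counter's value before the increment, so each lcore atomically reserves a distinct slot in the array. A minimal standalone sketch of the same reservation pattern (hypothetical stand-ins for sw->n_poll_lcores and sw->poll_lcores[], not the adapter code itself):

```c
#include <assert.h>

/* Hypothetical stand-ins for the adapter's shared state. */
static int n_poll_lcores;
static unsigned int poll_lcores[8];

/* __atomic_fetch_add returns the old counter value, which doubles as a
 * unique index: concurrent callers each get a distinct slot, unlike the
 * racy sw->poll_lcores[sw->n_poll_lcores++] read-modify-write. */
static void add_poll_lcore(unsigned int lcore)
{
	int n_lcores = __atomic_fetch_add(&n_poll_lcores, 1, __ATOMIC_RELAXED);

	__atomic_store_n(&poll_lcores[n_lcores], lcore, __ATOMIC_RELAXED);
}
```

RELAXED ordering is sufficient here because only the atomicity of the index reservation matters; visibility of the list contents is handled separately.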


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 2/4] eventdev: use C11 atomics for lcore timer armed flag
  2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
@ 2020-07-07 11:13     ` Phil Yang
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 3/4] eventdev: remove redundant code Phil Yang
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 11:13 UTC (permalink / raw)
  To: thomas, erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar,
	nd, david.marchand, mdr, nhorman, dodji

The in_use flag is a per-core variable which is not shared between
lcores in the normal case, and accesses to this variable are ordered
on the same core. However, if a non-EAL thread picks the highest lcore
to insert timers into, there is a possibility of conflicts on this
flag between threads, so an atomic CAS operation is needed.

Use the C11 atomic CAS instead of the generic rte_atomic operations
to avoid the unnecessary barrier on aarch64.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v2:
1. Make the code comments more accurate. (Erik)
2. Define the in_use flag as an unsigned type. (Stephen)

 lib/librte_eventdev/rte_event_timer_adapter.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 370ea40..6d01a34 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -554,7 +554,7 @@ struct swtim {
 	uint32_t timer_data_id;
 	/* Track which cores have actually armed a timer */
 	struct {
-		rte_atomic16_t v;
+		uint16_t v;
 	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
 	/* Track which cores' timer lists should be polled */
 	unsigned int poll_lcores[RTE_MAX_LCORE];
@@ -606,7 +606,8 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+		if (unlikely(sw->in_use[lcore].v == 0)) {
+			sw->in_use[lcore].v = 1;
 			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
 						     __ATOMIC_RELAXED);
 			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
@@ -834,7 +835,7 @@ swtim_init(struct rte_event_timer_adapter *adapter)
 
 	/* Initialize the variables that track in-use timer lists */
 	for (i = 0; i < RTE_MAX_LCORE; i++)
-		rte_atomic16_init(&sw->in_use[i].v);
+		sw->in_use[i].v = 0;
 
 	/* Initialize the timer subsystem and allocate timer data instance */
 	ret = rte_timer_subsystem_init();
@@ -1017,6 +1018,8 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
 	int n_lcores;
+	/* Timer list for this lcore is not in use. */
+	uint16_t exp_state = 0;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1035,8 +1038,12 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	/* If this is the first time we're arming an event timer on this lcore,
 	 * mark this lcore as "in use"; this will cause the service
 	 * function to process the timer list that corresponds to this lcore.
+	 * The atomic CAS operation can prevent the race condition on in_use
+	 * flag between multiple non-EAL threads.
 	 */
-	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
+	if (unlikely(__atomic_compare_exchange_n(&sw->in_use[lcore_id].v,
+			&exp_state, 1, 0,
+			__ATOMIC_RELAXED, __ATOMIC_RELAXED))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
 		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
-- 
2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 3/4] eventdev: remove redundant code
  2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
@ 2020-07-07 11:13     ` Phil Yang
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
  2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
  3 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 11:13 UTC (permalink / raw)
  To: thomas, erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar,
	nd, david.marchand, mdr, nhorman, dodji

No thread will access the impl_opaque data after the timer is
canceled. When a new timer is armed, the data is refilled. So the
cleanup process is unnecessary.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
 lib/librte_eventdev/rte_event_timer_adapter.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 6d01a34..d75415c 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -1167,8 +1167,6 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
 		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-		evtims[i]->impl_opaque[0] = 0;
-		evtims[i]->impl_opaque[1] = 0;
 
 		rte_smp_wmb();
 	}
-- 
2.7.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics
  2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 3/4] eventdev: remove redundant code Phil Yang
@ 2020-07-07 11:13     ` Phil Yang
  2020-07-07 14:29       ` Jerin Jacob
  2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
  3 siblings, 1 reply; 45+ messages in thread
From: Phil Yang @ 2020-07-07 11:13 UTC (permalink / raw)
  To: thomas, erik.g.carrillo, dev
  Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar,
	nd, david.marchand, mdr, nhorman, dodji

The impl_opaque field is shared between the timer arm and cancel
operations. Meanwhile, the state flag acts as a guard variable to
make sure the update of impl_opaque is synchronized. The original
code uses rte_smp barriers to achieve that. This patch uses C11
atomics with an explicit one-way memory barrier instead of full
barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.

Since compilers generate the same instructions for volatile and
non-volatile variables when the C11 __atomic built-ins are used, retain
the volatile keyword in front of the state enum to avoid an ABI break.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v3:
Fix ABI issue: revert to 'volatile enum rte_event_timer_state state'.

v2:
1. Removed implementation-specific opaque data cleanup code.
2. Replaced thread fence with atomic ACQUIRE/RELEASE ordering on state access.

 lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
 1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index d75415c..eb2c93a 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
 		sw->expired_timers[sw->n_expired_timers++] = tim;
 		sw->stats.evtim_exp_count++;
 
-		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+		__atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	int n_lcores;
 	/* Timer list for this lcore is not in use. */
 	uint16_t exp_state = 0;
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	}
 
 	for (i = 0; i < nb_evtims; i++) {
-		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
-			     evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+			     n_state == RTE_EVENT_TIMER_CANCELED)) {
 			rte_errno = EINVAL;
 			break;
 		}
 
 		ret = check_timeout(evtims[i], adapter);
 		if (unlikely(ret == -1)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOLATE,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		} else if (unlikely(ret == -2)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOEARLY,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
 
 		if (unlikely(check_destination_event_queue(evtims[i],
 							   adapter) < 0)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 					  SINGLE, lcore_id, NULL, evtims[i]);
 		if (ret < 0) {
 			/* tim was in RUNNING or CONFIG state */
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELEASE);
 			break;
 		}
 
-		rte_smp_wmb();
 		EVTIM_LOG_DBG("armed an event timer");
-		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+		/* RELEASE ordering guarantees the adapter specific value
+		 * changes observed before the update of state.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *timp;
 	uint64_t opaque;
 	struct swtim *sw = swtim_pmd_priv(adapter);
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 	for (i = 0; i < nb_evtims; i++) {
 		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+		/* ACQUIRE ordering guarantees the access of implementation
+		 * specific opague data under the correct state.
+		 */
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_CANCELED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EINVAL;
 			break;
 		}
 
-		rte_smp_rmb();
-
 		opaque = evtims[i]->impl_opaque[0];
 		timp = (struct rte_timer *)(uintptr_t)opaque;
 		RTE_ASSERT(timp != NULL);
@@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
-		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-
-		rte_smp_wmb();
+		/* The RELEASE ordering here pairs with atomic ordering
+		 * to make sure the state update data observed between
+		 * threads.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+				__ATOMIC_RELEASE);
 	}
 
 	return i;
-- 
2.7.4
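The RELEASE store on state pairs with the ACQUIRE load in the cancel path: the writer publishes impl_opaque first and then release-stores the guard, so a reader that acquire-loads ARMED is guaranteed to see the opaque data. A minimal single-threaded sketch of the pairing (hypothetical cut-down types, not the adapter's actual structures):

```c
#include <assert.h>
#include <stdint.h>

enum state { NOT_ARMED, ARMED, CANCELED };

/* Hypothetical cut-down timer: state guards impl_opaque. */
struct timer {
	uintptr_t impl_opaque;
	enum state state;
};

/* Arm side: write the payload first, then RELEASE-store the guard so
 * the payload is published before the state change becomes visible. */
static void arm(struct timer *t, uintptr_t payload)
{
	t->impl_opaque = payload;
	__atomic_store_n(&t->state, ARMED, __ATOMIC_RELEASE);
}

/* Cancel side: ACQUIRE-load the guard; observing ARMED guarantees that
 * the payload written before the matching RELEASE is visible here. */
static int cancel(struct timer *t, uintptr_t *payload)
{
	if (__atomic_load_n(&t->state, __ATOMIC_ACQUIRE) != ARMED)
		return -1;
	*payload = t->impl_opaque;
	__atomic_store_n(&t->state, CANCELED, __ATOMIC_RELEASE);
	return 0;
}
```

The one-way barriers replace the original rte_smp_wmb()/rte_smp_rmb() full-barrier pair while preserving the same publish/consume guarantee.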


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
@ 2020-07-07 14:29       ` Jerin Jacob
  2020-07-07 15:56         ` Phil Yang
  0 siblings, 1 reply; 45+ messages in thread
From: Jerin Jacob @ 2020-07-07 14:29 UTC (permalink / raw)
  To: Phil Yang
  Cc: Thomas Monjalon, Erik Gabriel Carrillo, dpdk-dev, Jerin Jacob,
	Honnappa Nagarahalli, David Christensen,
	Ruifeng Wang (Arm Technology China),
	Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
	dodji

On Tue, Jul 7, 2020 at 4:45 PM Phil Yang <phil.yang@arm.com> wrote:
>
> The impl_opaque field is shared between the timer arm and cancel
> operations. Meanwhile, the state flag acts as a guard variable to
> make sure the update of impl_opaque is synchronized. The original
> code uses rte_smp barriers to achieve that. This patch uses C11
> atomics with an explicit one-way memory barrier instead of full
> barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.
>
> Since compilers generate the same instructions for volatile and
> non-volatile variables when the C11 __atomic built-ins are used, retain
> the volatile keyword in front of the state enum to avoid an ABI break.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>


Could you fix the following:

WARNING:TYPO_SPELLING: 'opague' may be misspelled - perhaps 'opaque'?
#184: FILE: lib/librte_eventdev/rte_event_timer_adapter.c:1161:
+ * specific opague data under the correct state.

total: 0 errors, 1 warnings, 124 lines checked


> ---
> v3:
> Fix ABI issue: revert to 'volatile enum rte_event_timer_state state'.
>
> v2:
> 1. Removed implementation-specific opaque data cleanup code.
> 2. Replaced thread fence with atomic ACQUIRE/RELEASE ordering on state access.
>
>  lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
>  1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
> index d75415c..eb2c93a 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> @@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
>                 sw->expired_timers[sw->n_expired_timers++] = tim;
>                 sw->stats.evtim_exp_count++;
>
> -               evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
> +               __atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
> +                               __ATOMIC_RELEASE);
>         }
>
>         if (event_buffer_batch_ready(&sw->buffer)) {
> @@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
>         int n_lcores;
>         /* Timer list for this lcore is not in use. */
>         uint16_t exp_state = 0;
> +       enum rte_event_timer_state n_state;
>
>  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
>         /* Check that the service is running. */
> @@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
>         }
>
>         for (i = 0; i < nb_evtims; i++) {
> -               /* Don't modify the event timer state in these cases */
> -               if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
> +               n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
> +               if (n_state == RTE_EVENT_TIMER_ARMED) {
>                         rte_errno = EALREADY;
>                         break;
> -               } else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
> -                            evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
> +               } else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
> +                            n_state == RTE_EVENT_TIMER_CANCELED)) {
>                         rte_errno = EINVAL;
>                         break;
>                 }
>
>                 ret = check_timeout(evtims[i], adapter);
>                 if (unlikely(ret == -1)) {
> -                       evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
> +                       __atomic_store_n(&evtims[i]->state,
> +                                       RTE_EVENT_TIMER_ERROR_TOOLATE,
> +                                       __ATOMIC_RELAXED);
>                         rte_errno = EINVAL;
>                         break;
>                 } else if (unlikely(ret == -2)) {
> -                       evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
> +                       __atomic_store_n(&evtims[i]->state,
> +                                       RTE_EVENT_TIMER_ERROR_TOOEARLY,
> +                                       __ATOMIC_RELAXED);
>                         rte_errno = EINVAL;
>                         break;
>                 }
>
>                 if (unlikely(check_destination_event_queue(evtims[i],
>                                                            adapter) < 0)) {
> -                       evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> +                       __atomic_store_n(&evtims[i]->state,
> +                                       RTE_EVENT_TIMER_ERROR,
> +                                       __ATOMIC_RELAXED);
>                         rte_errno = EINVAL;
>                         break;
>                 }
> @@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
>                                           SINGLE, lcore_id, NULL, evtims[i]);
>                 if (ret < 0) {
>                         /* tim was in RUNNING or CONFIG state */
> -                       evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> +                       __atomic_store_n(&evtims[i]->state,
> +                                       RTE_EVENT_TIMER_ERROR,
> +                                       __ATOMIC_RELEASE);
>                         break;
>                 }
>
> -               rte_smp_wmb();
>                 EVTIM_LOG_DBG("armed an event timer");
> -               evtims[i]->state = RTE_EVENT_TIMER_ARMED;
> +               /* RELEASE ordering guarantees the adapter specific value
> +                * changes observed before the update of state.
> +                */
> +               __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
> +                               __ATOMIC_RELEASE);
>         }
>
>         if (i < nb_evtims)
> @@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
>         struct rte_timer *timp;
>         uint64_t opaque;
>         struct swtim *sw = swtim_pmd_priv(adapter);
> +       enum rte_event_timer_state n_state;
>
>  #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
>         /* Check that the service is running. */
> @@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
>
>         for (i = 0; i < nb_evtims; i++) {
>                 /* Don't modify the event timer state in these cases */
> -               if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
> +               /* ACQUIRE ordering guarantees the access of implementation
> +                * specific opague data under the correct state.
> +                */
> +               n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
> +               if (n_state == RTE_EVENT_TIMER_CANCELED) {
>                         rte_errno = EALREADY;
>                         break;
> -               } else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
> +               } else if (n_state != RTE_EVENT_TIMER_ARMED) {
>                         rte_errno = EINVAL;
>                         break;
>                 }
>
> -               rte_smp_rmb();
> -
>                 opaque = evtims[i]->impl_opaque[0];
>                 timp = (struct rte_timer *)(uintptr_t)opaque;
>                 RTE_ASSERT(timp != NULL);
> @@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
>
>                 rte_mempool_put(sw->tim_pool, (void **)timp);
>
> -               evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> -
> -               rte_smp_wmb();
> +               /* The RELEASE ordering here pairs with atomic ordering
> +                * to make sure the state update data observed between
> +                * threads.
> +                */
> +               __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
> +                               __ATOMIC_RELEASE);
>         }
>
>         return i;
> --
> 2.7.4
>


* [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter
  2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
                       ` (2 preceding siblings ...)
  2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
@ 2020-07-07 15:54     ` Phil Yang
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
                         ` (3 more replies)
  3 siblings, 4 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 15:54 UTC (permalink / raw)
  To: jerinj, dev
  Cc: thomas, erik.g.carrillo, Honnappa.Nagarahalli, drc, Ruifeng.Wang,
	Dharmik.Thakkar, nd, david.marchand, mdr, nhorman, dodji, stable

The n_poll_lcores counter and poll_lcores array are shared between
lcores, and updates to these variables are not protected by the
spinlock on each lcore's timer list. The read-modify-write operations
on the counter are not atomic, so there is a potential race condition
between lcores.

Use C11 atomics with RELAXED ordering to prevent conflicts.

Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
Cc: erik.g.carrillo@intel.com
Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v2:
Align the code. (Erik)

 lib/librte_eventdev/rte_event_timer_adapter.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 2321803..370ea40 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -583,6 +583,7 @@ swtim_callback(struct rte_timer *tim)
 	uint16_t nb_evs_invalid = 0;
 	uint64_t opaque;
 	int ret;
+	int n_lcores;
 
 	opaque = evtim->impl_opaque[1];
 	adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
@@ -605,8 +606,12 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v)))
-			sw->poll_lcores[sw->n_poll_lcores++] = lcore;
+		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+						     __ATOMIC_RELAXED);
+			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
+					__ATOMIC_RELAXED);
+		}
 	} else {
 		EVTIM_BUF_LOG_DBG("buffered an event timer expiry event");
 
@@ -1011,6 +1016,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	uint32_t lcore_id = rte_lcore_id();
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
+	int n_lcores;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1033,8 +1039,10 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
-		sw->poll_lcores[sw->n_poll_lcores] = lcore_id;
-		++sw->n_poll_lcores;
+		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+					     __ATOMIC_RELAXED);
+		__atomic_store_n(&sw->poll_lcores[n_lcores], lcore_id,
+				__ATOMIC_RELAXED);
 	}
 
 	ret = rte_mempool_get_bulk(sw->tim_pool, (void **)tims,
-- 
2.7.4



* [dpdk-dev] [PATCH v4 2/4] eventdev: use C11 atomics for lcore timer armed flag
  2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
@ 2020-07-07 15:54       ` Phil Yang
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 3/4] eventdev: remove redundant code Phil Yang
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 15:54 UTC (permalink / raw)
  To: jerinj, dev
  Cc: thomas, erik.g.carrillo, Honnappa.Nagarahalli, drc, Ruifeng.Wang,
	Dharmik.Thakkar, nd, david.marchand, mdr, nhorman, dodji, stable

The in_use flag is a per-core variable which is not shared between
lcores in the normal case, and accesses to it are ordered on the
same core. However, if a non-EAL thread picks the highest lcore to
insert timers into, conflicts on this flag between threads become
possible, so an atomic compare-and-swap operation is needed.

Use C11 atomics instead of the generic rte_atomic operations to
avoid unnecessary barriers on aarch64.

Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v4:
1. Change 'CAS' to 'compare-and-swap' to pass the coding style check.
2. Cc to stable release. (Honnappa)

v2:
1. Make the code comments more accurate. (Erik)
2. Define the in_use flag as an unsigned type. (Stephen)

 lib/librte_eventdev/rte_event_timer_adapter.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 370ea40..4ed3013 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -554,7 +554,7 @@ struct swtim {
 	uint32_t timer_data_id;
 	/* Track which cores have actually armed a timer */
 	struct {
-		rte_atomic16_t v;
+		uint16_t v;
 	} __rte_cache_aligned in_use[RTE_MAX_LCORE];
 	/* Track which cores' timer lists should be polled */
 	unsigned int poll_lcores[RTE_MAX_LCORE];
@@ -606,7 +606,8 @@ swtim_callback(struct rte_timer *tim)
 				      "with immediate expiry value");
 		}
 
-		if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore].v))) {
+		if (unlikely(sw->in_use[lcore].v == 0)) {
+			sw->in_use[lcore].v = 1;
 			n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
 						     __ATOMIC_RELAXED);
 			__atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
@@ -834,7 +835,7 @@ swtim_init(struct rte_event_timer_adapter *adapter)
 
 	/* Initialize the variables that track in-use timer lists */
 	for (i = 0; i < RTE_MAX_LCORE; i++)
-		rte_atomic16_init(&sw->in_use[i].v);
+		sw->in_use[i].v = 0;
 
 	/* Initialize the timer subsystem and allocate timer data instance */
 	ret = rte_timer_subsystem_init();
@@ -1017,6 +1018,8 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *tim, *tims[nb_evtims];
 	uint64_t cycles;
 	int n_lcores;
+	/* Timer list for this lcore is not in use. */
+	uint16_t exp_state = 0;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1035,8 +1038,12 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	/* If this is the first time we're arming an event timer on this lcore,
 	 * mark this lcore as "in use"; this will cause the service
 	 * function to process the timer list that corresponds to this lcore.
+	 * The atomic compare-and-swap operation can prevent the race condition
+	 * on in_use flag between multiple non-EAL threads.
 	 */
-	if (unlikely(rte_atomic16_test_and_set(&sw->in_use[lcore_id].v))) {
+	if (unlikely(__atomic_compare_exchange_n(&sw->in_use[lcore_id].v,
+			&exp_state, 1, 0,
+			__ATOMIC_RELAXED, __ATOMIC_RELAXED))) {
 		EVTIM_LOG_DBG("Adding lcore id = %u to list of lcores to poll",
 			      lcore_id);
 		n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
-- 
2.7.4



* [dpdk-dev] [PATCH v4 3/4] eventdev: remove redundant code
  2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
@ 2020-07-07 15:54       ` Phil Yang
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
  2020-07-08 13:30       ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
  3 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 15:54 UTC (permalink / raw)
  To: jerinj, dev
  Cc: thomas, erik.g.carrillo, Honnappa.Nagarahalli, drc, Ruifeng.Wang,
	Dharmik.Thakkar, nd, david.marchand, mdr, nhorman, dodji, stable

No thread will access the impl_opaque data after a timer is
canceled, and the data is refilled when a new timer is armed, so
the cleanup is unnecessary.

Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
---
v4:
Cc to stable release. (Honnappa)

 lib/librte_eventdev/rte_event_timer_adapter.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index 4ed3013..aa01b4d 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -1167,8 +1167,6 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
 		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-		evtims[i]->impl_opaque[0] = 0;
-		evtims[i]->impl_opaque[1] = 0;
 
 		rte_smp_wmb();
 	}
-- 
2.7.4



* [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics
  2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 3/4] eventdev: remove redundant code Phil Yang
@ 2020-07-07 15:54       ` Phil Yang
  2020-07-08 13:30       ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
  3 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 15:54 UTC (permalink / raw)
  To: jerinj, dev
  Cc: thomas, erik.g.carrillo, Honnappa.Nagarahalli, drc, Ruifeng.Wang,
	Dharmik.Thakkar, nd, david.marchand, mdr, nhorman, dodji, stable

The impl_opaque field is shared between the timer arm and cancel
operations. Meanwhile, the state flag acts as a guard variable to
make sure updates to impl_opaque are synchronized. The original
code uses rte_smp barriers to achieve that. This patch uses C11
atomics with explicit one-way memory ordering instead of the full
rte_smp_w/rmb() barriers to avoid unnecessary barriers on aarch64.

Since compilers generate the same instructions for volatile and
non-volatile variables when C11 __atomic built-ins are used, the
volatile keyword is kept in front of the state enum to avoid an
ABI break.

Cc: stable@dpdk.org

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v4:
1. Fix typo.
2. Cc to stable release. (Honnappa)

v3:
Fix ABI issue: revert to 'volatile enum rte_event_timer_state type state'.

v2:
1. Removed implementation-specific opaque data cleanup code.
2. Replaced thread fence with atomic ACQUIRE/RELEASE ordering on state access.

 lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
 1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index aa01b4d..4c5e49e 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
 		sw->expired_timers[sw->n_expired_timers++] = tim;
 		sw->stats.evtim_exp_count++;
 
-		evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+		__atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	int n_lcores;
 	/* Timer list for this lcore is not in use. */
 	uint16_t exp_state = 0;
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 	}
 
 	for (i = 0; i < nb_evtims; i++) {
-		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
-			     evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+		} else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+			     n_state == RTE_EVENT_TIMER_CANCELED)) {
 			rte_errno = EINVAL;
 			break;
 		}
 
 		ret = check_timeout(evtims[i], adapter);
 		if (unlikely(ret == -1)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOLATE,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		} else if (unlikely(ret == -2)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR_TOOEARLY,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
 
 		if (unlikely(check_destination_event_queue(evtims[i],
 							   adapter) < 0)) {
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELAXED);
 			rte_errno = EINVAL;
 			break;
 		}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
 					  SINGLE, lcore_id, NULL, evtims[i]);
 		if (ret < 0) {
 			/* tim was in RUNNING or CONFIG state */
-			evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+			__atomic_store_n(&evtims[i]->state,
+					RTE_EVENT_TIMER_ERROR,
+					__ATOMIC_RELEASE);
 			break;
 		}
 
-		rte_smp_wmb();
 		EVTIM_LOG_DBG("armed an event timer");
-		evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+		/* RELEASE ordering guarantees the adapter specific value
+		 * changes observed before the update of state.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+				__ATOMIC_RELEASE);
 	}
 
 	if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 	struct rte_timer *timp;
 	uint64_t opaque;
 	struct swtim *sw = swtim_pmd_priv(adapter);
+	enum rte_event_timer_state n_state;
 
 #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
 	/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 	for (i = 0; i < nb_evtims; i++) {
 		/* Don't modify the event timer state in these cases */
-		if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+		/* ACQUIRE ordering guarantees the access of implementation
+		 * specific opaque data under the correct state.
+		 */
+		n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+		if (n_state == RTE_EVENT_TIMER_CANCELED) {
 			rte_errno = EALREADY;
 			break;
-		} else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+		} else if (n_state != RTE_EVENT_TIMER_ARMED) {
 			rte_errno = EINVAL;
 			break;
 		}
 
-		rte_smp_rmb();
-
 		opaque = evtims[i]->impl_opaque[0];
 		timp = (struct rte_timer *)(uintptr_t)opaque;
 		RTE_ASSERT(timp != NULL);
@@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
 
 		rte_mempool_put(sw->tim_pool, (void **)timp);
 
-		evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-
-		rte_smp_wmb();
+		/* The RELEASE ordering here pairs with atomic ordering
+		 * to make sure the state update data observed between
+		 * threads.
+		 */
+		__atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+				__ATOMIC_RELEASE);
 	}
 
 	return i;
-- 
2.7.4



* Re: [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics
  2020-07-07 14:29       ` Jerin Jacob
@ 2020-07-07 15:56         ` Phil Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Phil Yang @ 2020-07-07 15:56 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: thomas, Erik Gabriel Carrillo, dpdk-dev, jerinj,
	Honnappa Nagarahalli, David Christensen, Ruifeng Wang,
	Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
	dodji

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Tuesday, July 7, 2020 10:30 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: thomas@monjalon.net; Erik Gabriel Carrillo <erik.g.carrillo@intel.com>;
> dpdk-dev <dev@dpdk.org>; jerinj@marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; David Christensen
> <drc@linux.vnet.ibm.com>; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>; David
> Marchand <david.marchand@redhat.com>; Ray Kinsella <mdr@ashroe.eu>;
> Neil Horman <nhorman@tuxdriver.com>; dodji@redhat.com
> Subject: Re: [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with
> C11 atomics
> 
> On Tue, Jul 7, 2020 at 4:45 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > The impl_opaque field is shared between the timer arm and cancel
> > operations. Meanwhile, the state flag acts as a guard variable to
> > make sure the update of impl_opaque is synchronized. The original
> > code uses rte_smp barriers to achieve that. This patch uses C11
> > atomics with an explicit one-way memory barrier instead of full
> > barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.
> >
> > Since compilers generate the same instructions for volatile and
> > non-volatile variables when C11 __atomic built-ins are used, the
> > volatile keyword is kept in front of the state enum to avoid an ABI break.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> 
> Could you fix the following:
> 
> WARNING:TYPO_SPELLING: 'opague' may be misspelled - perhaps 'opaque'?
> #184: FILE: lib/librte_eventdev/rte_event_timer_adapter.c:1161:
> + * specific opague data under the correct state.
Done. 

Thanks,
Phil


* Re: [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter
  2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
                         ` (2 preceding siblings ...)
  2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
@ 2020-07-08 13:30       ` Jerin Jacob
  2020-07-08 15:01         ` Thomas Monjalon
  3 siblings, 1 reply; 45+ messages in thread
From: Jerin Jacob @ 2020-07-08 13:30 UTC (permalink / raw)
  To: Phil Yang
  Cc: Jerin Jacob, dpdk-dev, Thomas Monjalon, Erik Gabriel Carrillo,
	Honnappa Nagarahalli, David Christensen,
	Ruifeng Wang (Arm Technology China),
	Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
	dodji, dpdk stable

On Tue, Jul 7, 2020 at 9:25 PM Phil Yang <phil.yang@arm.com> wrote:
>
> The n_poll_lcores counter and poll_lcores array are shared between
> lcores, and updates to these variables are not protected by the
> spinlock on each lcore's timer list. The read-modify-write operations
> on the counter are not atomic, so there is a potential race condition
> between lcores.
>
> Use C11 atomics with RELAXED ordering to prevent conflicts.
>
> Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> Cc: erik.g.carrillo@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

Hi Thomas,

The latest version does not have the ABI breakage issue.

I have added the ABI verifier in my local patch verification setup.

Series applied to dpdk-next-eventdev/master.

Please pull this series from dpdk-next-eventdev/master. Thanks.

I am marking this patch series as "Awaiting Upstream" in patchwork
status to reflect the actual status.


* Re: [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter
  2020-07-08 13:30       ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
@ 2020-07-08 15:01         ` Thomas Monjalon
  0 siblings, 0 replies; 45+ messages in thread
From: Thomas Monjalon @ 2020-07-08 15:01 UTC (permalink / raw)
  To: Phil Yang, Jerin Jacob
  Cc: dev, Erik Gabriel Carrillo, Honnappa Nagarahalli,
	David Christensen, Ruifeng Wang (Arm Technology China),
	Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
	dodji, dpdk stable, Jerin Jacob

08/07/2020 15:30, Jerin Jacob:
> On Tue, Jul 7, 2020 at 9:25 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > The n_poll_lcores counter and poll_lcores array are shared between
> > lcores, and updates to these variables are not protected by the
> > spinlock on each lcore's timer list. The read-modify-write operations
> > on the counter are not atomic, so there is a potential race condition
> > between lcores.
> >
> > Use C11 atomics with RELAXED ordering to prevent conflicts.
> >
> > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > Cc: erik.g.carrillo@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> Hi Thomas,
> 
> The latest version does not have the ABI breakage issue.
> 
> I have added the ABI verifier in my local patch verification setup.
> 
> Series applied to dpdk-next-eventdev/master.
> 
> Please pull this series from dpdk-next-eventdev/master. Thanks.
> 
> I am marking this patch series as "Awaiting Upstream" in patchwork
> status to reflect the actual status.

OK, pulled and marked as Accepted in patchwork.




end of thread, other threads:[~2020-07-08 15:01 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-12 11:19 [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Phil Yang
2020-06-12 11:19 ` [dpdk-dev] [PATCH 2/3] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
2020-06-23 21:01   ` Carrillo, Erik G
2020-06-28 16:12     ` Phil Yang
2020-06-23 21:20   ` Stephen Hemminger
2020-06-23 21:31     ` Carrillo, Erik G
2020-06-28 16:32       ` Phil Yang
2020-06-12 11:19 ` [dpdk-dev] [PATCH 3/3] eventdev: relax smp barriers with c11 atomics Phil Yang
2020-06-22 10:12   ` Phil Yang
2020-06-23 19:38     ` Carrillo, Erik G
2020-06-28 17:33       ` Phil Yang
2020-06-29 18:07         ` Carrillo, Erik G
2020-06-18 15:17 ` [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Carrillo, Erik G
2020-06-18 18:25   ` Honnappa Nagarahalli
2020-06-22  9:48     ` Phil Yang
2020-07-01 11:22       ` Jerin Jacob
2020-07-02  3:28         ` Phil Yang
2020-07-02  3:26     ` Phil Yang
2020-07-02  3:56       ` Honnappa Nagarahalli
2020-07-02 21:15         ` Carrillo, Erik G
2020-07-02 21:30           ` Honnappa Nagarahalli
2020-06-22  9:09   ` Phil Yang
2020-07-02  5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 2/4] eventdev: use c11 atomics for lcore timer armed flag Phil Yang
2020-07-02 20:21     ` Carrillo, Erik G
2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 3/4] eventdev: remove redundant code Phil Yang
2020-07-03  3:35     ` Dharmik Thakkar
2020-07-02  5:26   ` [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics Phil Yang
2020-07-02 20:30     ` Carrillo, Erik G
2020-07-03 10:50       ` Jerin Jacob
2020-07-06 10:04     ` Thomas Monjalon
2020-07-06 15:32       ` Phil Yang
2020-07-06 15:40         ` Thomas Monjalon
2020-07-07 11:13   ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 3/4] eventdev: remove redundant code Phil Yang
2020-07-07 11:13     ` [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
2020-07-07 14:29       ` Jerin Jacob
2020-07-07 15:56         ` Phil Yang
2020-07-07 15:54     ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 2/4] eventdev: use C11 atomics for lcore timer armed flag Phil Yang
2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 3/4] eventdev: remove redundant code Phil Yang
2020-07-07 15:54       ` [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
2020-07-08 13:30       ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
2020-07-08 15:01         ` Thomas Monjalon
