From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jianfeng.tan@intel.com>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id AE4B7A498
 for <dev@dpdk.org>; Fri, 27 Apr 2018 18:39:35 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 27 Apr 2018 09:39:33 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.49,335,1520924400"; d="scan'208";a="194920767"
Received: from dpdk06.sh.intel.com ([10.67.110.196])
 by orsmga004.jf.intel.com with ESMTP; 27 Apr 2018 09:39:33 -0700
From: Jianfeng Tan <jianfeng.tan@intel.com>
To: dev@dpdk.org
Cc: thomas@monjalon.net, Jianfeng Tan <jianfeng.tan@intel.com>,
 Olivier Matz <olivier.matz@6wind.com>,
 Anatoly Burakov <anatoly.burakov@intel.com>
Date: Fri, 27 Apr 2018 16:41:42 +0000
Message-Id: <1524847302-88110-1-git-send-email-jianfeng.tan@intel.com>
X-Mailer: git-send-email 2.7.4
Subject: [dpdk-dev] [PATCH] eal: fix threads block on barrier
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Apr 2018 16:39:36 -0000

The commit below introduced a pthread barrier for synchronization, but
two IPC threads block on the barrier and never wake up.

  (gdb) bt
  #0  futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
      at ../sysdeps/unix/sysv/linux/futex-internal.h:61
  #1  futex_wait_simple (private=0, expected=0, futex_word=0x7fffffffcff4)
      at ../sysdeps/nptl/futex-internal.h:135
  #2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at pthread_barrier_wait.c:184
  #3  rte_thread_init (arg=0x7fffffffcfe0)
      at ../dpdk/lib/librte_eal/common/eal_common_thread.c:160
  #4  start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
  #5  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

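Analysis suggests that the root cause could be that the barrier is
defined on the stack of rte_ctrl_thread_create(): once that function
returns, its stack frame may be reused while the new thread is still
inside pthread_barrier_wait(). This patch allocates the thread
parameters, including the barrier, on the heap instead.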

Fixes: d651ee4919cd ("eal: set affinity for control threads")

Cc: Olivier Matz <olivier.matz@6wind.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 lib/librte_eal/common/eal_common_thread.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 4e75cb8..da2b84f 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -166,17 +166,21 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
 		const pthread_attr_t *attr,
 		void *(*start_routine)(void *), void *arg)
 {
-	struct rte_thread_ctrl_params params = {
-		.start_routine = start_routine,
-		.arg = arg,
-	};
+	struct rte_thread_ctrl_params *params;
 	unsigned int lcore_id;
 	rte_cpuset_t cpuset;
 	int cpu_found, ret;
 
-	pthread_barrier_init(&params.configured, NULL, 2);
+	params = malloc(sizeof(*params));
+	if (!params)
+		return -1;
+
+	params->start_routine = start_routine;
+	params->arg = arg;
 
-	ret = pthread_create(thread, attr, rte_thread_init, (void *)&params);
+	pthread_barrier_init(&params->configured, NULL, 2);
+
+	ret = pthread_create(thread, attr, rte_thread_init, (void *)params);
 	if (ret != 0)
 		return ret;
 
@@ -203,12 +207,14 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
 	if (ret < 0)
 		goto fail;
 
-	pthread_barrier_wait(&params.configured);
+	pthread_barrier_wait(&params->configured);
+	free(params);
 
 	return 0;
 
 fail:
 	pthread_cancel(*thread);
 	pthread_join(*thread, NULL);
+	free(params);
 	return ret;
 }
-- 
2.7.4