From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <harry.van.haaren@intel.com>
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120])
 by dpdk.org (Postfix) with ESMTP id D4CB71B1DD
 for <dev@dpdk.org>; Tue,  9 Jan 2018 14:37:50 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 09 Jan 2018 05:37:50 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.46,335,1511856000"; d="scan'208";a="20165277"
Received: from silpixa00398672.ir.intel.com ([10.237.223.128])
 by fmsmga001.fm.intel.com with ESMTP; 09 Jan 2018 05:37:49 -0800
From: Harry van Haaren <harry.van.haaren@intel.com>
To: dev@dpdk.org
Cc: pbhagavatula@caviumnetworks.com, bruce.richardson@intel.com,
 Harry van Haaren <harry.van.haaren@intel.com>
Date: Tue,  9 Jan 2018 13:37:41 +0000
Message-Id: <1515505061-12112-2-git-send-email-harry.van.haaren@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1515505061-12112-1-git-send-email-harry.van.haaren@intel.com>
References: <1515497885-191922-1-git-send-email-harry.van.haaren@intel.com>
 <1515505061-12112-1-git-send-email-harry.van.haaren@intel.com>
Subject: [dpdk-dev] [PATCH v4 2/2] service: fix service core launch
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Jan 2018 13:37:51 -0000

This patch fixes a bug that showed up only intermittently in the
unit tests. The service lcore being started was not in the "WAIT"
state, and hence EAL would return -EBUSY instead of launching the
lcore.

To put a core into a launch-ready state, the application must call
rte_eal_wait_lcore(), which ensures that the core has completed its
previous task and that EAL is ready to re-launch it.
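
As an illustration only, the state handling described above can be
modelled in a few lines of plain C. This is a simplified sketch, not the
EAL implementation: the model_* names and the two relaunch_* helpers are
hypothetical, and the blocking behaviour of rte_eal_wait_lcore() is
approximated by completing the pending function inside model_wait().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-lcore state tracked by EAL. */
enum model_state { MODEL_WAIT, MODEL_RUNNING, MODEL_FINISHED };

struct model_lcore {
	enum model_state state;
};

/* Models a launch request: refused unless the lcore is in WAIT. */
static int
model_launch(struct model_lcore *lc)
{
	assert(lc != NULL);
	if (lc->state != MODEL_WAIT)
		return -EBUSY;
	lc->state = MODEL_RUNNING;
	return 0;
}

/* Models the launched function returning on the lcore. */
static void
model_finish(struct model_lcore *lc)
{
	if (lc->state == MODEL_RUNNING)
		lc->state = MODEL_FINISHED;
}

/*
 * Models the wait call: the real call would block the caller while the
 * lcore is still running; here that is approximated by finishing the
 * function first. The state is then reset to WAIT.
 */
static void
model_wait(struct model_lcore *lc)
{
	model_finish(lc);
	lc->state = MODEL_WAIT;
}

/* Launch, let the function return, then re-launch without waiting. */
int
relaunch_without_wait(void)
{
	struct model_lcore lc = { MODEL_WAIT };

	model_launch(&lc);
	model_finish(&lc);
	return model_launch(&lc); /* lcore is FINISHED, not WAIT */
}

/* Same sequence, but with the explicit wait before the re-launch. */
int
relaunch_with_wait(void)
{
	struct model_lcore lc = { MODEL_WAIT };

	model_launch(&lc);
	model_finish(&lc);
	model_wait(&lc);
	return model_launch(&lc);
}
```

In this model relaunch_without_wait() returns -EBUSY and
relaunch_with_wait() returns 0, which mirrors the failure seen in the
unit tests and the effect of adding the wait call.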

The call to rte_eal_wait_lcore() is deliberately not made inside the
service core function, so that it stays visible to the application.
Requiring an explicit function call ensures the developer sees
that the calling lcore could block in rte_eal_wait_lcore() if the
target core hasn't returned from its previous function.

From a usability perspective, hiding the wait_lcore() inside
service cores would cause confusion.

This patch adds rte_eal_wait_lcore() calls to the unit tests,
to ensure that the lcores for testing functionality are ready
to run the test.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v3, v4:
- No changes

v2:
- Increased delay time, as certain systems could fail intermittently
  due to the thread not being spawned before delay was over (Pavan)
- Added rte_eal_wait_lcore() on service cores to ensure cores are
  in a ready state before re-running test with new parameters (Pavan)

@Stable maintainers: this is an EXPERIMENTAL-tagged API, so
I'm not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/include/rte_service.h |  4 +++-
 test/test/test_service_cores.c              | 10 +++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_service.h b/lib/librte_eal/common/include/rte_service.h
index 5a76383..95def4c 100644
--- a/lib/librte_eal/common/include/rte_service.h
+++ b/lib/librte_eal/common/include/rte_service.h
@@ -246,7 +246,9 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  * Start a service core.
  *
  * Starting a core makes the core begin polling. Any services assigned to it
- * will be run as fast as possible.
+ * will be run as fast as possible. The application must ensure that the lcore
+ * is in a launchable state: e.g. call *rte_eal_wait_lcore* on the lcore_id
+ * before calling this function.
  *
  * @retval 0 Success
  * @retval -EINVAL Failed to start core. The *lcore_id* passed in is not
diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c
index 7d09f5c..2972a80 100644
--- a/test/test/test_service_cores.c
+++ b/test/test/test_service_cores.c
@@ -320,6 +320,7 @@ service_lcore_en_dis_able(void)
 
 	/* call remote_launch to verify that app can launch ex-service lcore */
 	service_remote_launch_flag = 0;
+	rte_eal_wait_lcore(slcore_id);
 	int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
 					slcore_id);
 	TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
@@ -334,7 +335,7 @@ static int
 service_lcore_running_check(void)
 {
 	uint64_t tick = service_tick;
-	rte_delay_ms(SERVICE_DELAY * 10);
+	rte_delay_ms(SERVICE_DELAY * 100);
 	/* if (tick != service_tick) we know the lcore as polled the service */
 	return tick != service_tick;
 }
@@ -477,6 +478,10 @@ service_threaded_test(int mt_safe)
 	if (!mt_safe)
 		test_params[1] = 1;
 
+	/* wait for lcores before start() */
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
+
 	rte_service_lcore_start(slcore_1);
 	rte_service_lcore_start(slcore_2);
 
@@ -490,6 +495,8 @@ service_threaded_test(int mt_safe)
 	TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0),
 			"Failed to stop MT Safe service");
 
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
 	unregister_all();
 
 	/* return the value of the callback pass_test variable to caller */
@@ -583,6 +590,7 @@ service_app_lcore_poll_impl(const int mt_safe)
 	rte_service_runstate_set(id, 1);
 
 	uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
+	rte_eal_wait_lcore(app_core2);
 	int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
 						  &id, app_core2);
 
-- 
2.7.4