* [dpdk-dev] [PATCH 1/2] service: fix del to reset lcore role to rte
From: Harry van Haaren @ 2017-12-20 11:21 UTC
To: dev; +Cc: Harry van Haaren

This patch fixes the reset of the service core,
that when rte_service_lcore_del() is called, the
lcore_role is restored to RTE.

This issue was reported as when running the unit tests, an
error was thrown that "failed to allocate lcore". Investigating
revealed that the state of the service-cores after del() was
not allowing a core to be re-used at a later point in time.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/rte_service.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index ae97e6b..15f63e7 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -583,13 +583,27 @@ rte_service_map_lcore_get(uint32_t id, uint32_t lcore)
 	return ret;
 }
 
+static void
+set_lcore_state(uint32_t lcore, int32_t state)
+{
+	/* mark core state in hugepage backed config */
+	struct rte_config *cfg = rte_eal_get_configuration();
+	cfg->lcore_role[lcore] = state;
+
+	/* mark state in process local lcore_config */
+	lcore_config[lcore].core_role = state;
+
+	/* update per-lcore optimized state tracking */
+	lcore_states[lcore].is_service_core = (state == ROLE_SERVICE);
+}
+
 int32_t rte_service_lcore_reset_all(void)
 {
 	/* loop over cores, reset all to mask 0 */
 	uint32_t i;
 	for (i = 0; i < RTE_MAX_LCORE; i++) {
 		lcore_states[i].service_mask = 0;
-		lcore_states[i].is_service_core = 0;
+		set_lcore_state(i, ROLE_RTE);
 		lcore_states[i].runstate = RUNSTATE_STOPPED;
 	}
 	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
@@ -600,20 +614,6 @@ int32_t rte_service_lcore_reset_all(void)
 	return 0;
 }
 
-static void
-set_lcore_state(uint32_t lcore, int32_t state)
-{
-	/* mark core state in hugepage backed config */
-	struct rte_config *cfg = rte_eal_get_configuration();
-	cfg->lcore_role[lcore] = state;
-
-	/* mark state in process local lcore_config */
-	lcore_config[lcore].core_role = state;
-
-	/* update per-lcore optimized state tracking */
-	lcore_states[lcore].is_service_core = (state == ROLE_SERVICE);
-}
-
 int32_t
 rte_service_lcore_add(uint32_t lcore)
 {
--
2.7.4
* [dpdk-dev] [PATCH 2/2] service: fix service core launch
From: Harry van Haaren @ 2017-12-20 11:21 UTC
To: dev; +Cc: Harry van Haaren

This patch fixes a potential bug, which was not consistently
showing up in the unit tests. The issue was that the service-lcore
being started was not in a "WAIT" state, and hence EAL would
return -EBUSY instead of launching the lcore.

In order to ensure a core is in a launch-ready state, the application
must call rte_eal_wait_lcore, to ensure that the core has completed
its previous task, and that EAL is ready to re-launch it.

The call to rte_eal_wait_lcore() is explicitly not in the service
core function, to make it visible to the application. Requiring an
explicit function call ensures the developer sees that a lcore could
block in the rte_eal_wait_lcore() function if the core hasn't returned
from its previous function.

From a usability perspective, hiding the wait_lcore() inside service
cores would cause confusion.

This patch adds rte_eal_wait_lcore() calls to the unit tests, to
ensure that the lcores for testing functionality are ready to run
the test.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/include/rte_service.h | 4 +++-
 test/test/test_service_cores.c              | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_service.h b/lib/librte_eal/common/include/rte_service.h
index 9272440..495b531 100644
--- a/lib/librte_eal/common/include/rte_service.h
+++ b/lib/librte_eal/common/include/rte_service.h
@@ -274,7 +274,9 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  * Start a service core.
  *
  * Starting a core makes the core begin polling. Any services assigned to it
- * will be run as fast as possible.
+ * will be run as fast as possible. The application must ensure that the lcore
+ * is in a launchable state: e.g. call *rte_eal_lcore_wait* on the lcore_id
+ * before calling this function.
  *
  * @retval 0 Success
  * @retval -EINVAL Failed to start core. The *lcore_id* passed in is not
diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c
index 311c704..43f2318 100644
--- a/test/test/test_service_cores.c
+++ b/test/test/test_service_cores.c
@@ -348,6 +348,7 @@ service_lcore_en_dis_able(void)
 
 	/* call remote_launch to verify that app can launch ex-service lcore */
 	service_remote_launch_flag = 0;
+	rte_eal_wait_lcore(slcore_id);
 	int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
 					slcore_id);
 	TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
@@ -505,6 +506,10 @@ service_threaded_test(int mt_safe)
 	if (!mt_safe)
 		test_params[1] = 1;
 
+	/* wait for lcores before start() */
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
+
 	rte_service_lcore_start(slcore_1);
 	rte_service_lcore_start(slcore_2);
 
@@ -611,6 +616,7 @@ service_app_lcore_poll_impl(const int mt_safe)
 	rte_service_runstate_set(id, 1);
 
 	uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
+	rte_eal_wait_lcore(app_core2);
 	int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
 			&id, app_core2);
 
--
2.7.4
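
Editorial sketch (not part of the patch): the launch rule the commit message describes can be modelled in a few lines of plain C. This is a toy state machine under stated assumptions, not EAL code: every `toy_` name is hypothetical, and because the model is single-threaded, the wait helper returns -EAGAIN at the point where, per the commit message, a real rte_eal_wait_lcore() call could block.

```c
#include <assert.h>
#include <errno.h>

/* Toy model only: all toy_ names are hypothetical, none come from DPDK. */
enum toy_state { TOY_WAIT, TOY_RUNNING, TOY_FINISHED };

#define TOY_MAX_LCORE 4

/* Static storage is zero-initialised, so every lcore starts in TOY_WAIT. */
static enum toy_state toy_lcore_state[TOY_MAX_LCORE];

/* A launch is accepted only from WAIT; any other state yields -EBUSY. */
static int toy_remote_launch(unsigned int lcore)
{
	assert(lcore < TOY_MAX_LCORE);
	if (toy_lcore_state[lcore] != TOY_WAIT)
		return -EBUSY;
	toy_lcore_state[lcore] = TOY_RUNNING;
	return 0;
}

/* Models the launched function returning on the worker lcore. */
static void toy_worker_done(unsigned int lcore)
{
	assert(lcore < TOY_MAX_LCORE);
	toy_lcore_state[lcore] = TOY_FINISHED;
}

/* Models the wait step: a finished lcore is moved back to WAIT.
 * -EAGAIN is a toy stand-in for "the caller would block here". */
static int toy_wait_lcore(unsigned int lcore)
{
	assert(lcore < TOY_MAX_LCORE);
	if (toy_lcore_state[lcore] == TOY_RUNNING)
		return -EAGAIN;
	toy_lcore_state[lcore] = TOY_WAIT;
	return 0;
}
```

In this model, a second launch on an lcore whose function has already returned still fails with -EBUSY until toy_wait_lcore() is called on it, which mirrors the intermittent failure the unit tests were hitting.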
* Re: [dpdk-dev] [PATCH 2/2] service: fix service core launch
From: Pavan Nikhilesh @ 2018-01-04 15:30 UTC
To: Harry van Haaren; +Cc: dev

Hi Harry,

On Wed, Dec 20, 2017 at 11:21:47AM +0000, Harry van Haaren wrote:
> diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c
> index 311c704..43f2318 100644
> --- a/test/test/test_service_cores.c
> +++ b/test/test/test_service_cores.c
> @@ -348,6 +348,7 @@ service_lcore_en_dis_able(void)
>
>  	/* call remote_launch to verify that app can launch ex-service lcore */
>  	service_remote_launch_flag = 0;
> +	rte_eal_wait_lcore(slcore_id);
>  	int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
>  					slcore_id);
>  	TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
> @@ -505,6 +506,10 @@ service_threaded_test(int mt_safe)
>  	if (!mt_safe)
>  		test_params[1] = 1;
>
> +	/* wait for lcores before start() */
> +	rte_eal_wait_lcore(slcore_1);
> +	rte_eal_wait_lcore(slcore_2);
> +
>  	rte_service_lcore_start(slcore_1);
>  	rte_service_lcore_start(slcore_2);

As you are touching this file can you change following things:

Need to increase the delay to a value similar to other tc.

 service_lcore_running_check(void)
 {
 	uint64_t tick = service_tick;
-	rte_delay_ms(SERVICE_DELAY * 10);
+	rte_delay_ms(100);
 	/* if (tick != service_tick) we know the lcore as polled the service */
 	return tick != service_tick;
 }

As service_mt_unsafe_poll and service_mt_safe_poll use the same function body
and are called one after the other we need to wait for them to complete before
proceeding to the next tc i.e
service_mt_unsafe_poll -> wait for the cores to complete -> service_mt_safe_poll
else it will lead to unintended side effects.

@@ -523,6 +523,8 @@ service_threaded_test(int mt_safe)
 	TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0),
 			"Failed to stop MT Safe service");
 
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
 	unregister_all();
 
 	/* return the value of the callback pass_test variable to caller */

Cheers,
Pavan.

>
> @@ -611,6 +616,7 @@ service_app_lcore_poll_impl(const int mt_safe)
>  	rte_service_runstate_set(id, 1);
>
>  	uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
> +	rte_eal_wait_lcore(app_core2);
>  	int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
>  			&id, app_core2);
>
> --
> 2.7.4
>
* Re: [dpdk-dev] [PATCH 1/2] service: fix del to reset lcore role to rte
From: Pavan Nikhilesh @ 2018-01-04 15:20 UTC
To: Harry van Haaren; +Cc: dev

Hi Harry,

Comments inline.

On Wed, Dec 20, 2017 at 11:21:46AM +0000, Harry van Haaren wrote:
> This patch fixes the reset of the service core,
> that when rte_service_lcore_del() is called, the
> lcore_role is restored to RTE.
>
> This issue was reported as when running the unit tests, an
> error was thrown that "failed to allocate lcore". Investigating
> revealed that the state of the service-cores after del() was
> not allowing a core to be re-used at a later point in time.
>
> Fixes: 21698354c832 ("service: introduce service cores concept")
> +CC stable@dpdk.org
>
> Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
>
> ---
>
> @Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
> not sure what the expectation is in terms of backporting.
> ---
>  lib/librte_eal/common/rte_service.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
>
<snip>
>  int32_t rte_service_lcore_reset_all(void)
>  {
>  	/* loop over cores, reset all to mask 0 */
>  	uint32_t i;
>  	for (i = 0; i < RTE_MAX_LCORE; i++) {
>  		lcore_states[i].service_mask = 0;
> -		lcore_states[i].is_service_core = 0;
> +		set_lcore_state(i, ROLE_RTE);

Setting ROLE_RTE for RTE_MAX_LCORE lcores is incorrect.
There should be a check to set only service lcores something like this:

 	for (i = 0; i < RTE_MAX_LCORE; i++) {
-		lcore_states[i].service_mask = 0;
-		set_lcore_state(i, ROLE_RTE);
-		lcore_states[i].runstate = RUNSTATE_STOPPED;
+		if (lcore_states[i].is_service_core) {
+			lcore_states[i].service_mask = 0;
+			set_lcore_state(i, ROLE_RTE);
+			lcore_states[i].runstate = RUNSTATE_STOPPED;
+		}

Cheers,
Pavan.

>  		lcore_states[i].runstate = RUNSTATE_STOPPED;
>  	}
>  	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
> @@ -600,20 +614,6 @@ int32_t rte_service_lcore_reset_all(void)
>  	return 0;
>  }
>
>  int32_t
>  rte_service_lcore_add(uint32_t lcore)
>  {
> --
> 2.7.4
>
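
Editorial sketch (not part of the thread): the review comment above can be illustrated with a toy model of the role table. The review does not spell out the failure mode, so the reading here is an assumption: an unconditional reset would also relabel lcores that were never service cores, for example ones that are not in use at all. Every `toy_` name is hypothetical and the three roles only mirror the names used in the patch.

```c
#include <assert.h>

/* Toy model only: all toy_ names are hypothetical, none come from DPDK. */
enum toy_role { TOY_ROLE_RTE, TOY_ROLE_OFF, TOY_ROLE_SERVICE };

#define TOY_MAX_LCORE 4

static enum toy_role toy_lcore_role[TOY_MAX_LCORE];

/* v1 behaviour: every slot is rewritten, whatever its previous role was. */
static void toy_reset_all_unconditional(void)
{
	unsigned int i;

	for (i = 0; i < TOY_MAX_LCORE; i++)
		toy_lcore_role[i] = TOY_ROLE_RTE;
}

/* Reviewed behaviour: only slots currently marked as service cores change. */
static void toy_reset_all_service_only(void)
{
	unsigned int i;

	for (i = 0; i < TOY_MAX_LCORE; i++) {
		if (toy_lcore_role[i] == TOY_ROLE_SERVICE)
			toy_lcore_role[i] = TOY_ROLE_RTE;
	}
}

/* Helper to set up the same starting table for both variants. */
static void toy_setup(void)
{
	toy_lcore_role[0] = TOY_ROLE_RTE;
	toy_lcore_role[1] = TOY_ROLE_SERVICE;
	toy_lcore_role[2] = TOY_ROLE_OFF;
	toy_lcore_role[3] = TOY_ROLE_OFF;
}
```

Starting from the table built by toy_setup(), the unconditional variant turns the two unused slots into application lcores, while the service-only variant leaves them untouched and returns just slot 1 to the application role.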
* [dpdk-dev] [PATCH v2 1/2] service: fix del to reset lcore role to rte
From: Harry van Haaren @ 2018-01-08 15:58 UTC
To: dev; +Cc: pbhagavatula, Harry van Haaren

This patch fixes the reset of the service core,
that when rte_service_lcore_del() is called, the
lcore_role is restored to RTE.

This issue was reported as when running the unit tests, an
error was thrown that "failed to allocate lcore". Investigating
revealed that the state of the service-cores after del() was
not allowing a core to be re-used at a later point in time.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v2:
- Only update state on service core ids (Pavan)

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/rte_service.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 372d0bb..5240811 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -554,23 +554,6 @@ rte_service_map_lcore_get(uint32_t id, uint32_t lcore)
 	return ret;
 }
 
-int32_t rte_service_lcore_reset_all(void)
-{
-	/* loop over cores, reset all to mask 0 */
-	uint32_t i;
-	for (i = 0; i < RTE_MAX_LCORE; i++) {
-		lcore_states[i].service_mask = 0;
-		lcore_states[i].is_service_core = 0;
-		lcore_states[i].runstate = RUNSTATE_STOPPED;
-	}
-	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
-		rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
-
-	rte_smp_wmb();
-
-	return 0;
-}
-
 static void
 set_lcore_state(uint32_t lcore, int32_t state)
 {
@@ -585,6 +568,25 @@ set_lcore_state(uint32_t lcore, int32_t state)
 	lcore_states[lcore].is_service_core = (state == ROLE_SERVICE);
 }
 
+int32_t rte_service_lcore_reset_all(void)
+{
+	/* loop over cores, reset all to mask 0 */
+	uint32_t i;
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		if(lcore_states[i].is_service_core) {
+			lcore_states[i].service_mask = 0;
+			set_lcore_state(i, ROLE_RTE);
+			lcore_states[i].runstate = RUNSTATE_STOPPED;
+		}
+	}
+	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
+		rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
+
+	rte_smp_wmb();
+
+	return 0;
+}
+
 int32_t
 rte_service_lcore_add(uint32_t lcore)
 {
--
2.7.4
* [dpdk-dev] [PATCH v2 2/2] service: fix service core launch
From: Harry van Haaren @ 2018-01-08 15:58 UTC
To: dev; +Cc: pbhagavatula, Harry van Haaren

This patch fixes a potential bug, which was not consistently
showing up in the unit tests. The issue was that the service-lcore
being started was not in a "WAIT" state, and hence EAL would
return -EBUSY instead of launching the lcore.

In order to ensure a core is in a launch-ready state, the application
must call rte_eal_wait_lcore, to ensure that the core has completed
its previous task, and that EAL is ready to re-launch it.

The call to rte_eal_wait_lcore() is explicitly not in the service
core function, to make it visible to the application. Requiring an
explicit function call ensures the developer sees that a lcore could
block in the rte_eal_wait_lcore() function if the core hasn't returned
from its previous function.

From a usability perspective, hiding the wait_lcore() inside service
cores would cause confusion.

This patch adds rte_eal_wait_lcore() calls to the unit tests, to
ensure that the lcores for testing functionality are ready to run
the test.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v2:
- Increased delay time, as certain systems could fail intermittently
  due to the thread not being spawned before delay was over (Pavan)
- Added rte_eal_wait_lcore() on service cores to ensure cores are
  ready state before re-running test with new parameters (Pavan)

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/include/rte_service.h |  4 +++-
 test/test/test_service_cores.c              | 10 +++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_service.h b/lib/librte_eal/common/include/rte_service.h
index 5a76383..95def4c 100644
--- a/lib/librte_eal/common/include/rte_service.h
+++ b/lib/librte_eal/common/include/rte_service.h
@@ -246,7 +246,9 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  * Start a service core.
  *
  * Starting a core makes the core begin polling. Any services assigned to it
- * will be run as fast as possible.
+ * will be run as fast as possible. The application must ensure that the lcore
+ * is in a launchable state: e.g. call *rte_eal_lcore_wait* on the lcore_id
+ * before calling this function.
  *
  * @retval 0 Success
  * @retval -EINVAL Failed to start core. The *lcore_id* passed in is not
diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c
index 7d09f5c..2972a80 100644
--- a/test/test/test_service_cores.c
+++ b/test/test/test_service_cores.c
@@ -320,6 +320,7 @@ service_lcore_en_dis_able(void)
 
 	/* call remote_launch to verify that app can launch ex-service lcore */
 	service_remote_launch_flag = 0;
+	rte_eal_wait_lcore(slcore_id);
 	int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
 					slcore_id);
 	TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
@@ -334,7 +335,7 @@ static int
 service_lcore_running_check(void)
 {
 	uint64_t tick = service_tick;
-	rte_delay_ms(SERVICE_DELAY * 10);
+	rte_delay_ms(SERVICE_DELAY * 100);
 	/* if (tick != service_tick) we know the lcore as polled the service */
 	return tick != service_tick;
 }
@@ -477,6 +478,10 @@ service_threaded_test(int mt_safe)
 	if (!mt_safe)
 		test_params[1] = 1;
 
+	/* wait for lcores before start() */
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
+
 	rte_service_lcore_start(slcore_1);
 	rte_service_lcore_start(slcore_2);
 
@@ -490,6 +495,8 @@ service_threaded_test(int mt_safe)
 	TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0),
 			"Failed to stop MT Safe service");
 
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
 	unregister_all();
 
 	/* return the value of the callback pass_test variable to caller */
@@ -583,6 +590,7 @@ service_app_lcore_poll_impl(const int mt_safe)
 	rte_service_runstate_set(id, 1);
 
 	uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
+	rte_eal_wait_lcore(app_core2);
 	int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
 			&id, app_core2);
 
--
2.7.4
* [dpdk-dev] [PATCH v3 1/2] service: fix del to reset lcore role to rte
From: Harry van Haaren @ 2018-01-09 11:38 UTC
To: dev; +Cc: pbhagavatula, Harry van Haaren

This patch fixes the reset of the service core,
that when rte_service_lcore_del() is called, the
lcore_role is restored to RTE.

This issue was reported as when running the unit tests, an
error was thrown that "failed to allocate lcore". Investigating
revealed that the state of the service-cores after del() was
not allowing a core to be re-used at a later point in time.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v3:
- Fix whitespace issue introduced in v2 (Doh :)

v2:
- Only update state on service core ids (Pavan)

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/rte_service.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 372d0bb..44a988a 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -554,23 +554,6 @@ rte_service_map_lcore_get(uint32_t id, uint32_t lcore)
 	return ret;
 }
 
-int32_t rte_service_lcore_reset_all(void)
-{
-	/* loop over cores, reset all to mask 0 */
-	uint32_t i;
-	for (i = 0; i < RTE_MAX_LCORE; i++) {
-		lcore_states[i].service_mask = 0;
-		lcore_states[i].is_service_core = 0;
-		lcore_states[i].runstate = RUNSTATE_STOPPED;
-	}
-	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
-		rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
-
-	rte_smp_wmb();
-
-	return 0;
-}
-
 static void
 set_lcore_state(uint32_t lcore, int32_t state)
 {
@@ -585,6 +568,25 @@ set_lcore_state(uint32_t lcore, int32_t state)
 	lcore_states[lcore].is_service_core = (state == ROLE_SERVICE);
 }
 
+int32_t rte_service_lcore_reset_all(void)
+{
+	/* loop over cores, reset all to mask 0 */
+	uint32_t i;
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		if (lcore_states[i].is_service_core) {
+			lcore_states[i].service_mask = 0;
+			set_lcore_state(i, ROLE_RTE);
+			lcore_states[i].runstate = RUNSTATE_STOPPED;
+		}
+	}
+	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
+		rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
+
+	rte_smp_wmb();
+
+	return 0;
+}
+
 int32_t
 rte_service_lcore_add(uint32_t lcore)
 {
--
2.7.4
* [dpdk-dev] [PATCH v3 2/2] service: fix service core launch
From: Harry van Haaren @ 2018-01-09 11:38 UTC
To: dev; +Cc: pbhagavatula, Harry van Haaren

This patch fixes a potential bug, which was not consistently
showing up in the unit tests. The issue was that the service-lcore
being started was not in a "WAIT" state, and hence EAL would
return -EBUSY instead of launching the lcore.

In order to ensure a core is in a launch-ready state, the application
must call rte_eal_wait_lcore, to ensure that the core has completed
its previous task, and that EAL is ready to re-launch it.

The call to rte_eal_wait_lcore() is explicitly not in the service
core function, to make it visible to the application. Requiring an
explicit function call ensures the developer sees that a lcore could
block in the rte_eal_wait_lcore() function if the core hasn't returned
from its previous function.

From a usability perspective, hiding the wait_lcore() inside service
cores would cause confusion.

This patch adds rte_eal_wait_lcore() calls to the unit tests, to
ensure that the lcores for testing functionality are ready to run
the test.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v3: no change, only patch 1/2 changed.

v2:
- Increased delay time, as certain systems could fail intermittently
  due to the thread not being spawned before delay was over (Pavan)
- Added rte_eal_wait_lcore() on service cores to ensure cores are
  ready state before re-running test with new parameters (Pavan)

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/include/rte_service.h |  4 +++-
 test/test/test_service_cores.c              | 10 +++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_service.h b/lib/librte_eal/common/include/rte_service.h
index 5a76383..95def4c 100644
--- a/lib/librte_eal/common/include/rte_service.h
+++ b/lib/librte_eal/common/include/rte_service.h
@@ -246,7 +246,9 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  * Start a service core.
  *
  * Starting a core makes the core begin polling. Any services assigned to it
- * will be run as fast as possible.
+ * will be run as fast as possible. The application must ensure that the lcore
+ * is in a launchable state: e.g. call *rte_eal_lcore_wait* on the lcore_id
+ * before calling this function.
  *
  * @retval 0 Success
  * @retval -EINVAL Failed to start core. The *lcore_id* passed in is not
diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c
index 7d09f5c..2972a80 100644
--- a/test/test/test_service_cores.c
+++ b/test/test/test_service_cores.c
@@ -320,6 +320,7 @@ service_lcore_en_dis_able(void)
 
 	/* call remote_launch to verify that app can launch ex-service lcore */
 	service_remote_launch_flag = 0;
+	rte_eal_wait_lcore(slcore_id);
 	int ret = rte_eal_remote_launch(service_remote_launch_func, NULL,
 					slcore_id);
 	TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed.");
@@ -334,7 +335,7 @@ static int
 service_lcore_running_check(void)
 {
 	uint64_t tick = service_tick;
-	rte_delay_ms(SERVICE_DELAY * 10);
+	rte_delay_ms(SERVICE_DELAY * 100);
 	/* if (tick != service_tick) we know the lcore as polled the service */
 	return tick != service_tick;
 }
@@ -477,6 +478,10 @@ service_threaded_test(int mt_safe)
 	if (!mt_safe)
 		test_params[1] = 1;
 
+	/* wait for lcores before start() */
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
+
 	rte_service_lcore_start(slcore_1);
 	rte_service_lcore_start(slcore_2);
 
@@ -490,6 +495,8 @@ service_threaded_test(int mt_safe)
 	TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0),
 			"Failed to stop MT Safe service");
 
+	rte_eal_wait_lcore(slcore_1);
+	rte_eal_wait_lcore(slcore_2);
 	unregister_all();
 
 	/* return the value of the callback pass_test variable to caller */
@@ -583,6 +590,7 @@ service_app_lcore_poll_impl(const int mt_safe)
 	rte_service_runstate_set(id, 1);
 
 	uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1);
+	rte_eal_wait_lcore(app_core2);
 	int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func,
 			&id, app_core2);
 
--
2.7.4
* Re: [dpdk-dev] [PATCH v3 1/2] service: fix del to reset lcore role to rte
From: Bruce Richardson @ 2018-01-09 12:14 UTC
To: Harry van Haaren; +Cc: dev, pbhagavatula

On Tue, Jan 09, 2018 at 11:38:04AM +0000, Harry van Haaren wrote:
> This patch fixes the reset of the service core,
> that when rte_service_lcore_del() is called, the
> lcore_role is restored to RTE.
>

Title seems awkward, how about "fix lcore role after delete"?
* [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete
From: Harry van Haaren @ 2018-01-09 13:37 UTC
To: dev; +Cc: pbhagavatula, bruce.richardson, Harry van Haaren

This patch fixes the reset of the service core,
that when rte_service_lcore_del() is called, the
lcore_role is restored to RTE.

This issue was reported as when running the unit tests, an
error was thrown that "failed to allocate lcore". Investigating
revealed that the state of the service-cores after del() was
not allowing a core to be re-used at a later point in time.

Fixes: 21698354c832 ("service: introduce service cores concept")
+CC stable@dpdk.org

Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v4:
- Improve commit title (Bruce)

v3:
- Fix whitespace issue introduced in v2 (Doh :)

v2:
- Only update state on service core ids (Pavan)

@Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm
not sure what the expectation is in terms of backporting.
---
 lib/librte_eal/common/rte_service.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 372d0bb..44a988a 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -554,23 +554,6 @@ rte_service_map_lcore_get(uint32_t id, uint32_t lcore)
 	return ret;
 }
 
-int32_t rte_service_lcore_reset_all(void)
-{
-	/* loop over cores, reset all to mask 0 */
-	uint32_t i;
-	for (i = 0; i < RTE_MAX_LCORE; i++) {
-		lcore_states[i].service_mask = 0;
-		lcore_states[i].is_service_core = 0;
-		lcore_states[i].runstate = RUNSTATE_STOPPED;
-	}
-	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
-		rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
-
-	rte_smp_wmb();
-
-	return 0;
-}
-
 static void
 set_lcore_state(uint32_t lcore, int32_t state)
 {
@@ -585,6 +568,25 @@ set_lcore_state(uint32_t lcore, int32_t state)
 	lcore_states[lcore].is_service_core = (state == ROLE_SERVICE);
 }
 
+int32_t rte_service_lcore_reset_all(void)
+{
+	/* loop over cores, reset all to mask 0 */
+	uint32_t i;
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		if (lcore_states[i].is_service_core) {
+			lcore_states[i].service_mask = 0;
+			set_lcore_state(i, ROLE_RTE);
+			lcore_states[i].runstate = RUNSTATE_STOPPED;
+		}
+	}
+	for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
+		rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
+
+	rte_smp_wmb();
+
+	return 0;
+}
+
 int32_t
 rte_service_lcore_add(uint32_t lcore)
 {
--
2.7.4
* [dpdk-dev] [PATCH v4 2/2] service: fix service core launch 2018-01-09 13:37 ` [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete Harry van Haaren @ 2018-01-09 13:37 ` Harry van Haaren 2018-01-10 10:23 ` [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete Pavan Nikhilesh 1 sibling, 0 replies; 13+ messages in thread From: Harry van Haaren @ 2018-01-09 13:37 UTC (permalink / raw) To: dev; +Cc: pbhagavatula, bruce.richardson, Harry van Haaren This patch fixes a potential bug, which was not consistently showing up in the unit tests. The issue was that the service- lcore being started was not in a "WAIT" state, and hence EAL would return -EBUSY instead of launching the lcore. In order to ensure a core is in a launch-ready state, the application must call rte_eal_wait_lcore, to ensure that the core has completed its previous task, and that EAL is ready to re-launch it. The call to rte_eal_wait_lcore() is explicitly not in the service core function, to make it visible to the application. Requiring an explicit function call ensures the developer sees that a lcore could block in the rte_eal_wait_lcore() function if the core hasn't returned from its previous function. >From a usability perspective, hiding the wait_lcore() inside service cores would cause confusion. This patch adds rte_eal_wait_lcore() calls to the unit tests, to ensure that the lcores for testing functionality are ready to run the test. 
Fixes: 21698354c832 ("service: introduce service cores concept") +CC stable@dpdk.org Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> --- v3, v4: - No changes v2: - Increased delay time, as certain systems could fail intermittently due to the thread not being spawned before delay was over (Pavan) - Added rte_eal_wait_lcore() on service cores to ensure cores are in a ready state before re-running test with new parameters (Pavan) @Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm not sure what the expectation is in terms of backporting. --- lib/librte_eal/common/include/rte_service.h | 4 +++- test/test/test_service_cores.c | 10 +++++++++- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/common/include/rte_service.h b/lib/librte_eal/common/include/rte_service.h index 5a76383..95def4c 100644 --- a/lib/librte_eal/common/include/rte_service.h +++ b/lib/librte_eal/common/include/rte_service.h @@ -246,7 +246,9 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id, * Start a service core. * * Starting a core makes the core begin polling. Any services assigned to it - * will be run as fast as possible. + * will be run as fast as possible. The application must ensure that the lcore + * is in a launchable state: e.g. call *rte_eal_wait_lcore* on the lcore_id + * before calling this function. * * @retval 0 Success * @retval -EINVAL Failed to start core.
The *lcore_id* passed in is not diff --git a/test/test/test_service_cores.c b/test/test/test_service_cores.c index 7d09f5c..2972a80 100644 --- a/test/test/test_service_cores.c +++ b/test/test/test_service_cores.c @@ -320,6 +320,7 @@ service_lcore_en_dis_able(void) /* call remote_launch to verify that app can launch ex-service lcore */ service_remote_launch_flag = 0; + rte_eal_wait_lcore(slcore_id); int ret = rte_eal_remote_launch(service_remote_launch_func, NULL, slcore_id); TEST_ASSERT_EQUAL(0, ret, "Ex-service core remote launch failed."); @@ -334,7 +335,7 @@ static int service_lcore_running_check(void) { uint64_t tick = service_tick; - rte_delay_ms(SERVICE_DELAY * 10); + rte_delay_ms(SERVICE_DELAY * 100); /* if (tick != service_tick) we know the lcore as polled the service */ return tick != service_tick; } @@ -477,6 +478,10 @@ service_threaded_test(int mt_safe) if (!mt_safe) test_params[1] = 1; + /* wait for lcores before start() */ + rte_eal_wait_lcore(slcore_1); + rte_eal_wait_lcore(slcore_2); + rte_service_lcore_start(slcore_1); rte_service_lcore_start(slcore_2); @@ -490,6 +495,8 @@ service_threaded_test(int mt_safe) TEST_ASSERT_EQUAL(0, rte_service_runstate_set(sid, 0), "Failed to stop MT Safe service"); + rte_eal_wait_lcore(slcore_1); + rte_eal_wait_lcore(slcore_2); unregister_all(); /* return the value of the callback pass_test variable to caller */ @@ -583,6 +590,7 @@ service_app_lcore_poll_impl(const int mt_safe) rte_service_runstate_set(id, 1); uint32_t app_core2 = rte_get_next_lcore(slcore_id, 1, 1); + rte_eal_wait_lcore(app_core2); int app_core2_ret = rte_eal_remote_launch(service_run_on_app_core_func, &id, app_core2); -- 2.7.4 ^ permalink raw reply [flat|nested] 13+ messages in thread
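The wait-before-launch requirement described in the commit message above can be modelled with a tiny single-threaded state machine. This is a sketch under stated assumptions, not the EAL implementation: the `sketch_*` names are hypothetical, the three states mirror the "WAIT" state named in the commit message plus assumed running and finished states, and the real rte_eal_wait_lcore() blocks on a running lcore where this model simply returns an error.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_LCORE 8 /* hypothetical stand-in for RTE_MAX_LCORE */

/* Assumed lcore life cycle: idle, executing a function, done but not reaped. */
enum sketch_lcore_state { SKETCH_WAIT, SKETCH_RUNNING, SKETCH_FINISHED };

/* Zero-initialised, so every lcore starts in SKETCH_WAIT. */
static enum sketch_lcore_state sketch_lcore[SKETCH_MAX_LCORE];

/* Models the launch call: it only succeeds from the WAIT state. */
static int
sketch_remote_launch(uint32_t lcore)
{
	if (sketch_lcore[lcore] != SKETCH_WAIT)
		return -EBUSY;
	sketch_lcore[lcore] = SKETCH_RUNNING;
	return 0;
}

/* Models the launched function returning on the remote lcore. */
static void
sketch_task_done(uint32_t lcore)
{
	sketch_lcore[lcore] = SKETCH_FINISHED;
}

/* Models the wait call: reaps a finished lcore and puts it back in WAIT. */
static int
sketch_wait_lcore(uint32_t lcore)
{
	if (sketch_lcore[lcore] == SKETCH_RUNNING)
		return -1; /* the real call would block here instead */
	sketch_lcore[lcore] = SKETCH_WAIT;
	return 0;
}
```

In this model a second `sketch_remote_launch()` on a finished lcore returns -EBUSY until `sketch_wait_lcore()` has been called, which is the situation the added rte_eal_wait_lcore() calls in the unit tests are meant to avoid.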
* Re: [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete 2018-01-09 13:37 ` [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete Harry van Haaren 2018-01-09 13:37 ` [dpdk-dev] [PATCH v4 2/2] service: fix service core launch Harry van Haaren @ 2018-01-10 10:23 ` Pavan Nikhilesh 2018-01-11 22:30 ` Thomas Monjalon 1 sibling, 1 reply; 13+ messages in thread From: Pavan Nikhilesh @ 2018-01-10 10:23 UTC (permalink / raw) To: Harry van Haaren, bruce.richardson; +Cc: dev On Tue, Jan 09, 2018 at 01:37:40PM +0000, Harry van Haaren wrote: > This patch fixes the reset of the service core, > that when rte_service_lcore_del() is called, the > lcore_role is restored to RTE. > > This issue was reported as when running the unit tests, an > error was thrown that "failed to allocate lcore". Investigating > revealed that the state of the service-cores after del() was > not allowing a core to be re-used at a later point in time. > > Fixes: 21698354c832 ("service: introduce service cores concept") > +CC stable@dpdk.org > > Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > > --- > > v4: > - Improve commit title (Bruce) > > v3: > - Fix whitespace issue introduced in v2 (Doh :) > > v2: > - Only update state on service core ids (Pavan) > > @Stable maintainers; this is an EXPERIMENTAL tagged API, so I'm > not sure what the expectation is in terms of backporting. 
> --- > lib/librte_eal/common/rte_service.c | 36 +++++++++++++++++++----------------- > 1 file changed, 19 insertions(+), 17 deletions(-) > > diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c > index 372d0bb..44a988a 100644 > --- a/lib/librte_eal/common/rte_service.c > +++ b/lib/librte_eal/common/rte_service.c > @@ -554,23 +554,6 @@ rte_service_map_lcore_get(uint32_t id, uint32_t lcore) > return ret; > } > > -int32_t rte_service_lcore_reset_all(void) > -{ > - /* loop over cores, reset all to mask 0 */ > - uint32_t i; > - for (i = 0; i < RTE_MAX_LCORE; i++) { > - lcore_states[i].service_mask = 0; > - lcore_states[i].is_service_core = 0; > - lcore_states[i].runstate = RUNSTATE_STOPPED; > - } > - for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) > - rte_atomic32_set(&rte_services[i].num_mapped_cores, 0); > - > - rte_smp_wmb(); > - > - return 0; > -} > - > static void > set_lcore_state(uint32_t lcore, int32_t state) > { > @@ -585,6 +568,25 @@ set_lcore_state(uint32_t lcore, int32_t state) > lcore_states[lcore].is_service_core = (state == ROLE_SERVICE); > } > > +int32_t rte_service_lcore_reset_all(void) > +{ > + /* loop over cores, reset all to mask 0 */ > + uint32_t i; > + for (i = 0; i < RTE_MAX_LCORE; i++) { > + if (lcore_states[i].is_service_core) { > + lcore_states[i].service_mask = 0; > + set_lcore_state(i, ROLE_RTE); > + lcore_states[i].runstate = RUNSTATE_STOPPED; > + } > + } > + for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) > + rte_atomic32_set(&rte_services[i].num_mapped_cores, 0); > + > + rte_smp_wmb(); > + > + return 0; > +} > + > int32_t > rte_service_lcore_add(uint32_t lcore) > { > -- > 2.7.4 > LGTM Series-Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete 2018-01-10 10:23 ` [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete Pavan Nikhilesh @ 2018-01-11 22:30 ` Thomas Monjalon 0 siblings, 0 replies; 13+ messages in thread From: Thomas Monjalon @ 2018-01-11 22:30 UTC (permalink / raw) To: Harry van Haaren; +Cc: dev, Pavan Nikhilesh, bruce.richardson 10/01/2018 11:23, Pavan Nikhilesh: > On Tue, Jan 09, 2018 at 01:37:40PM +0000, Harry van Haaren wrote: > > This patch fixes the reset of the service core, > > that when rte_service_lcore_del() is called, the > > lcore_role is restored to RTE. > > > > This issue was reported as when running the unit tests, an > > error was thrown that "failed to allocate lcore". Investigating > > revealed that the state of the service-cores after del() was > > not allowing a core to be re-used at a later point in time. > > > > Fixes: 21698354c832 ("service: introduce service cores concept") > > +CC stable@dpdk.org The canonical format is: Cc: stable@dpdk.org > > Reported-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com> > > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> > > Series-Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com> Applied, thanks ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-01-11 22:31 UTC | newest]
Thread overview: 13+ messages
2017-12-20 11:21 [dpdk-dev] [PATCH 1/2] service: fix del to reset lcore role to rte Harry van Haaren
2017-12-20 11:21 ` [dpdk-dev] [PATCH 2/2] service: fix service core launch Harry van Haaren
2018-01-04 15:30 ` Pavan Nikhilesh
2018-01-04 15:20 ` [dpdk-dev] [PATCH 1/2] service: fix del to reset lcore role to rte Pavan Nikhilesh
2018-01-08 15:58 ` [dpdk-dev] [PATCH v2 " Harry van Haaren
2018-01-08 15:58 ` [dpdk-dev] [PATCH v2 2/2] service: fix service core launch Harry van Haaren
2018-01-09 11:38 ` [dpdk-dev] [PATCH v3 1/2] service: fix del to reset lcore role to rte Harry van Haaren
2018-01-09 11:38 ` [dpdk-dev] [PATCH v3 2/2] service: fix service core launch Harry van Haaren
2018-01-09 12:14 ` [dpdk-dev] [PATCH v3 1/2] service: fix del to reset lcore role to rte Bruce Richardson
2018-01-09 13:37 ` [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete Harry van Haaren
2018-01-09 13:37 ` [dpdk-dev] [PATCH v4 2/2] service: fix service core launch Harry van Haaren
2018-01-10 10:23 ` [dpdk-dev] [PATCH v4 1/2] service: fix lcore role after delete Pavan Nikhilesh
2018-01-11 22:30 ` Thomas Monjalon