* [dpdk-stable] [PATCH] eal: fix possible UB on creation of ctrl thread
@ 2021-03-24 13:04 Luc Pelletier
2021-03-25 11:27 ` [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation Olivier Matz
2021-04-06 16:15 ` [dpdk-stable] [PATCH v3] " Luc Pelletier
0 siblings, 2 replies; 24+ messages in thread
From: Luc Pelletier @ 2021-03-24 13:04 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan; +Cc: dev, Luc Pelletier, stable
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
lib/librte_eal/common/eal_common_thread.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..bd9a9fdbc 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,6 +170,7 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ bool barrier_in_use;
};
static void *ctrl_thread_init(void *arg)
@@ -186,9 +187,13 @@ static void *ctrl_thread_init(void *arg)
ret = pthread_barrier_wait(¶ms->configured);
if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
+ while (__atomic_load_n(¶ms->barrier_in_use,
+ __ATOMIC_ACQUIRE))
+ sched_yield();
pthread_barrier_destroy(¶ms->configured);
free(params);
- }
+ } else
+ __atomic_store_n(¶ms->barrier_in_use, 0, __ATOMIC_RELEASE);
return start_routine(routine_arg);
}
@@ -210,6 +215,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
+ params->barrier_in_use = 1;
pthread_barrier_init(¶ms->configured, NULL, 2);
@@ -232,18 +238,26 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
ret = pthread_barrier_wait(¶ms->configured);
if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
+ while (__atomic_load_n(¶ms->barrier_in_use,
+ __ATOMIC_ACQUIRE))
+ sched_yield();
pthread_barrier_destroy(¶ms->configured);
free(params);
- }
+ } else
+ __atomic_store_n(¶ms->barrier_in_use, 0, __ATOMIC_RELEASE);
return 0;
fail:
if (PTHREAD_BARRIER_SERIAL_THREAD ==
pthread_barrier_wait(¶ms->configured)) {
+ while (__atomic_load_n(¶ms->barrier_in_use,
+ __ATOMIC_ACQUIRE))
+ sched_yield();
pthread_barrier_destroy(¶ms->configured);
free(params);
- }
+ } else
+ __atomic_store_n(¶ms->barrier_in_use, 0, __ATOMIC_RELEASE);
pthread_cancel(*thread);
pthread_join(*thread, NULL);
return -ret;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation
2021-03-24 13:04 [dpdk-stable] [PATCH] eal: fix possible UB on creation of ctrl thread Luc Pelletier
@ 2021-03-25 11:27 ` Olivier Matz
2021-03-25 14:42 ` Luc Pelletier
2021-04-02 4:34 ` [dpdk-stable] [dpdk-dev] " Honnappa Nagarahalli
2021-04-06 16:15 ` [dpdk-stable] [PATCH v3] " Luc Pelletier
1 sibling, 2 replies; 24+ messages in thread
From: Olivier Matz @ 2021-03-25 11:27 UTC (permalink / raw)
To: lucp.at.work; +Cc: dev, jianfeng.tan, david.marchand, stable
As reported by Luc, there is a race where the barrier is destroyed by
one thread, while the other thread did not yet leave
pthread_barrier_wait.
This patch fixes the race condition by adding an atomic counter to
ensure that the barrier is destroyed only it is not used by any thread.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Reported-by: Luc Pelletier <lucp.at.work@gmail.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
Hi Luc,
Thank you for reporting this problem and submitting the patch.
I think the issue can be fixed without any loop, like in this
patch. What do you think?
Regards,
Olivier
lib/librte_eal/common/eal_common_thread.c | 38 +++++++++++++----------
1 file changed, 21 insertions(+), 17 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902a..891f825e87 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,11 +170,11 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ unsigned int barrier_refcnt;
};
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
@@ -184,8 +184,9 @@ static void *ctrl_thread_init(void *arg)
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
+ pthread_barrier_wait(¶ms->configured);
+ if (__atomic_sub_fetch(¶ms->barrier_refcnt, 1,
+ __ATOMIC_ACQ_REL) == 0) {
pthread_barrier_destroy(¶ms->configured);
free(params);
}
@@ -210,15 +211,17 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
-
- pthread_barrier_init(¶ms->configured, NULL, 2);
-
- ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
+ __atomic_store_n(¶ms->barrier_refcnt, 2, __ATOMIC_RELEASE);
+ ret = pthread_barrier_init(¶ms->configured, NULL, 2);
if (ret != 0) {
free(params);
return -ret;
}
+ ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
+ if (ret != 0)
+ goto fail;
+
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
if (ret < 0)
@@ -227,25 +230,26 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
- if (ret)
- goto fail;
+ if (ret != 0)
+ goto fail_cancel;
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
+ pthread_barrier_wait(¶ms->configured);
+ if (__atomic_sub_fetch(¶ms->barrier_refcnt, 1,
+ __ATOMIC_ACQ_REL) == 0) {
pthread_barrier_destroy(¶ms->configured);
free(params);
}
return 0;
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+fail_cancel:
pthread_cancel(*thread);
pthread_join(*thread, NULL);
+
+fail:
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+
return -ret;
}
--
2.29.2
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation
2021-03-25 11:27 ` [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation Olivier Matz
@ 2021-03-25 14:42 ` Luc Pelletier
2021-04-02 4:34 ` [dpdk-stable] [dpdk-dev] " Honnappa Nagarahalli
1 sibling, 0 replies; 24+ messages in thread
From: Luc Pelletier @ 2021-03-25 14:42 UTC (permalink / raw)
To: Olivier Matz; +Cc: dev, jianfeng.tan, david.marchand, stable
Hi Olivier,
> Thank you for reporting this problem and submitting the patch.
> I think the issue can be fixed without any loop, like in this
> patch. What do you think?
I think getting rid of the loop is an excellent idea. Good thinking.
Your version is much cleaner.
> + __atomic_store_n(¶ms->barrier_refcnt, 2, __ATOMIC_RELEASE);
I don't mean to nitpick but I don't think you need to use
__atomic_store_n to initialize the refcnt. Either way is fine of
course :)
Thanks.
Le jeu. 25 mars 2021 à 07:27, Olivier Matz <olivier.matz@6wind.com> a écrit :
>
> As reported by Luc, there is a race where the barrier is destroyed by
> one thread, while the other thread did not yet leave
> pthread_barrier_wait.
>
> This patch fixes the race condition by adding an atomic counter to
> ensure that the barrier is destroyed only it is not used by any thread.
>
> Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Reported-by: Luc Pelletier <lucp.at.work@gmail.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>
> Hi Luc,
>
> Thank you for reporting this problem and submitting the patch.
> I think the issue can be fixed without any loop, like in this
> patch. What do you think?
>
> Regards,
> Olivier
>
>
> lib/librte_eal/common/eal_common_thread.c | 38 +++++++++++++----------
> 1 file changed, 21 insertions(+), 17 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902a..891f825e87 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -170,11 +170,11 @@ struct rte_thread_ctrl_params {
> void *(*start_routine)(void *);
> void *arg;
> pthread_barrier_t configured;
> + unsigned int barrier_refcnt;
> };
>
> static void *ctrl_thread_init(void *arg)
> {
> - int ret;
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> @@ -184,8 +184,9 @@ static void *ctrl_thread_init(void *arg)
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> + pthread_barrier_wait(¶ms->configured);
> + if (__atomic_sub_fetch(¶ms->barrier_refcnt, 1,
> + __ATOMIC_ACQ_REL) == 0) {
> pthread_barrier_destroy(¶ms->configured);
> free(params);
> }
> @@ -210,15 +211,17 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
>
> params->start_routine = start_routine;
> params->arg = arg;
> -
> - pthread_barrier_init(¶ms->configured, NULL, 2);
> -
> - ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> + __atomic_store_n(¶ms->barrier_refcnt, 2, __ATOMIC_RELEASE);
> + ret = pthread_barrier_init(¶ms->configured, NULL, 2);
> if (ret != 0) {
> free(params);
> return -ret;
> }
>
> + ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> + if (ret != 0)
> + goto fail;
> +
> if (name != NULL) {
> ret = rte_thread_setname(*thread, name);
> if (ret < 0)
> @@ -227,25 +230,26 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> }
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> - if (ret)
> - goto fail;
> + if (ret != 0)
> + goto fail_cancel;
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> + pthread_barrier_wait(¶ms->configured);
> + if (__atomic_sub_fetch(¶ms->barrier_refcnt, 1,
> + __ATOMIC_ACQ_REL) == 0) {
> pthread_barrier_destroy(¶ms->configured);
> free(params);
> }
>
> return 0;
>
> -fail:
> - if (PTHREAD_BARRIER_SERIAL_THREAD ==
> - pthread_barrier_wait(¶ms->configured)) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> +fail_cancel:
> pthread_cancel(*thread);
> pthread_join(*thread, NULL);
> +
> +fail:
> + pthread_barrier_destroy(¶ms->configured);
> + free(params);
> +
> return -ret;
> }
>
> --
> 2.29.2
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v2] eal: fix race in ctrl thread creation
2021-03-25 11:27 ` [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation Olivier Matz
2021-03-25 14:42 ` Luc Pelletier
@ 2021-04-02 4:34 ` Honnappa Nagarahalli
1 sibling, 0 replies; 24+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-02 4:34 UTC (permalink / raw)
To: Olivier Matz, lucp.at.work
Cc: dev, jianfeng.tan, david.marchand, stable, nd, Honnappa Nagarahalli, nd
<snip>
>
> As reported by Luc, there is a race where the barrier is destroyed by one
> thread, while the other thread did not yet leave pthread_barrier_wait.
Please correct me if I am wrong. We are using the pthread_barrier to
1) know when to free 'params'.
2) set the thread affinity before the thread starts running its function
I think this patch is fine. But, it can be simplified.
>
> This patch fixes the race condition by adding an atomic counter to ensure
> that the barrier is destroyed only it is not used by any thread.
>
> Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Reported-by: Luc Pelletier <lucp.at.work@gmail.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>
> Hi Luc,
>
> Thank you for reporting this problem and submitting the patch.
> I think the issue can be fixed without any loop, like in this patch. What do
> you think?
>
> Regards,
> Olivier
>
>
> lib/librte_eal/common/eal_common_thread.c | 38 +++++++++++++----------
> 1 file changed, 21 insertions(+), 17 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_thread.c
> b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902a..891f825e87 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -170,11 +170,11 @@ struct rte_thread_ctrl_params {
> void *(*start_routine)(void *);
> void *arg;
> pthread_barrier_t configured;
We can get rid of the barrier variable. The 'barrier_refcnt' is enough to know when to free the memory.
Setting the thread affinity can be moved to 'ctrl_thread_init' function.
> + unsigned int barrier_refcnt;
> };
>
> static void *ctrl_thread_init(void *arg) {
> - int ret;
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset; @@ -184,8
> +184,9 @@ static void *ctrl_thread_init(void *arg)
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> + pthread_barrier_wait(¶ms->configured);
> + if (__atomic_sub_fetch(¶ms->barrier_refcnt, 1,
> + __ATOMIC_ACQ_REL) == 0) {
> pthread_barrier_destroy(¶ms->configured);
> free(params);
> }
> @@ -210,15 +211,17 @@ rte_ctrl_thread_create(pthread_t *thread, const
> char *name,
>
> params->start_routine = start_routine;
> params->arg = arg;
> -
> - pthread_barrier_init(¶ms->configured, NULL, 2);
> -
> - ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> + __atomic_store_n(¶ms->barrier_refcnt, 2,
> __ATOMIC_RELEASE);
We can use __ATOMIC_RELAXED or just a simple assignment as pthread_create ensures that all the earlier memory writes are visible to the 'ctrl_thread_init'.
> + ret = pthread_barrier_init(¶ms->configured, NULL, 2);
> if (ret != 0) {
> free(params);
> return -ret;
> }
>
> + ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> + if (ret != 0)
> + goto fail;
> +
> if (name != NULL) {
> ret = rte_thread_setname(*thread, name);
> if (ret < 0)
> @@ -227,25 +230,26 @@ rte_ctrl_thread_create(pthread_t *thread, const
> char *name,
> }
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
I think this can be moved to 'ctrl_thread_init'.
> - if (ret)
> - goto fail;
> + if (ret != 0)
> + goto fail_cancel;
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> + pthread_barrier_wait(¶ms->configured);
> + if (__atomic_sub_fetch(¶ms->barrier_refcnt, 1,
> + __ATOMIC_ACQ_REL) == 0) {
> pthread_barrier_destroy(¶ms->configured);
> free(params);
> }
>
> return 0;
>
> -fail:
> - if (PTHREAD_BARRIER_SERIAL_THREAD ==
> - pthread_barrier_wait(¶ms->configured)) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> +fail_cancel:
> pthread_cancel(*thread);
> pthread_join(*thread, NULL);
> +
> +fail:
> + pthread_barrier_destroy(¶ms->configured);
> + free(params);
> +
> return -ret;
> }
>
> --
> 2.29.2
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH v3] eal: fix race in ctrl thread creation
2021-03-24 13:04 [dpdk-stable] [PATCH] eal: fix possible UB on creation of ctrl thread Luc Pelletier
2021-03-25 11:27 ` [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation Olivier Matz
@ 2021-04-06 16:15 ` Luc Pelletier
2021-04-06 21:10 ` Honnappa Nagarahalli
1 sibling, 1 reply; 24+ messages in thread
From: Luc Pelletier @ 2021-04-06 16:15 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The creation of control threads used a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour. The barrier has been completely removed in favour
of a reference count on the control thread parameters struct.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
Hi Honnappa,
Thanks for your input Honnappa. I've made the changes to completely
remove the barrier. However, I didn't move the call to
pthread_setaffinity_np to the control thread; I think we want to report
the result of that function to the caller of rte_ctrl_thread_create and
doing so from ctrl_thread_init would be a lot trickier.
Olivier, what do you think of these changes?
lib/librte_eal/common/eal_common_thread.c | 35 ++++++++---------------
1 file changed, 12 insertions(+), 23 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..2421066f9 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -169,12 +169,11 @@ __rte_thread_uninit(void)
struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
- pthread_barrier_t configured;
+ unsigned int refcnt;
};
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
@@ -184,11 +183,8 @@ static void *ctrl_thread_init(void *arg)
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0)
free(params);
- }
return start_routine(routine_arg);
}
@@ -210,14 +206,11 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
-
- pthread_barrier_init(¶ms->configured, NULL, 2);
+ params->refcnt = 2;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -228,24 +221,20 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
if (ret)
- goto fail;
+ goto fail_cancel;
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0)
free(params);
- }
return 0;
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+fail_cancel:
pthread_cancel(*thread);
pthread_join(*thread, NULL);
+
+fail:
+ free(params);
+
return -ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v3] eal: fix race in ctrl thread creation
2021-04-06 16:15 ` [dpdk-stable] [PATCH v3] " Luc Pelletier
@ 2021-04-06 21:10 ` Honnappa Nagarahalli
2021-04-07 12:35 ` [dpdk-stable] [PATCH v4] " Luc Pelletier
0 siblings, 1 reply; 24+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-06 21:10 UTC (permalink / raw)
To: Luc Pelletier, olivier.matz, jianfeng.tan
Cc: dev, stable, nd, Honnappa Nagarahalli, nd
<snip>
>
> The creation of control threads used a pthread barrier for synchronization.
> This patch fixes a race condition where the pthread barrier could get
> destroyed while one of the threads has not yet returned from the
> pthread_barrier_wait function, which could result in undefined behaviour.
> The barrier has been completely removed in favour of a reference count on
> the control thread parameters struct.
>
> Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
> ---
>
> Hi Olivier,
> Hi Honnappa,
>
> Thanks for your input Honnappa. I've made the changes to completely
> remove the barrier. However, I didn't move the call to pthread_setaffinity_np
I looked at the rte_ctrl_thread_create API definition. I am not sure if we have much leavy here as the API definition bakes in calling pthread_setaffinity_np.
With the barrier, the control thread did not execute its function till it was moved to the assigned core. If we remove the barrier and not move the pthread_setaffinity_np to ctrl_thread_init, the behavior might change.
One more comment below on how we are handling the setaffinity failure.
> to the control thread; I think we want to report the result of that function to
> the caller of rte_ctrl_thread_create and doing so from ctrl_thread_init would
> be a lot trickier.
>
> Olivier, what do you think of these changes?
>
> lib/librte_eal/common/eal_common_thread.c | 35 ++++++++---------------
> 1 file changed, 12 insertions(+), 23 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_thread.c
> b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902..2421066f9 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -169,12 +169,11 @@ __rte_thread_uninit(void) struct
> rte_thread_ctrl_params {
> void *(*start_routine)(void *);
> void *arg;
> - pthread_barrier_t configured;
> + unsigned int refcnt;
> };
>
> static void *ctrl_thread_init(void *arg) {
> - int ret;
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset; @@ -184,11
> +183,8 @@ static void *ctrl_thread_init(void *arg)
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> + if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) ==
> 0)
> free(params);
> - }
>
> return start_routine(routine_arg);
> }
> @@ -210,14 +206,11 @@ rte_ctrl_thread_create(pthread_t *thread, const
> char *name,
>
> params->start_routine = start_routine;
> params->arg = arg;
> -
> - pthread_barrier_init(¶ms->configured, NULL, 2);
> + params->refcnt = 2;
>
> ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> - if (ret != 0) {
> - free(params);
> - return -ret;
> - }
> + if (ret != 0)
> + goto fail;
>
> if (name != NULL) {
> ret = rte_thread_setname(*thread, name); @@ -228,24
> +221,20 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> if (ret)
> - goto fail;
> + goto fail_cancel;
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> + if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) ==
> 0)
> free(params);
> - }
>
> return 0;
>
> -fail:
> - if (PTHREAD_BARRIER_SERIAL_THREAD ==
> - pthread_barrier_wait(¶ms->configured)) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> +fail_cancel:
> pthread_cancel(*thread);
For the control thread to be able to respond to pthread_cancel, it needs to call one of the library functions that is considered a cancellation point. I do not see such requirements in the API definition.
> pthread_join(*thread, NULL);
> +
> +fail:
> + free(params);
> +
> return -ret;
> }
>
> --
> 2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH v4] eal: fix race in ctrl thread creation
2021-04-06 21:10 ` Honnappa Nagarahalli
@ 2021-04-07 12:35 ` Luc Pelletier
2021-04-07 12:53 ` [dpdk-stable] [PATCH v5] " Luc Pelletier
0 siblings, 1 reply; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 12:35 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
Hi Honnappa,
Thanks for your comments Honnappa. You make a very good point about the
cancellation point requirements. I've removed the pthread_cancel and
pthread_join calls. I've also added back the barrier to make sure the
control thread function only runs after the affinity has been set. If
setting the affinity fails, the control thread will run without getting
cancelled, but rte_ctrl_thread_create will return the error returned by
pthread_set_affinity_np.
I have also eliminated the duplication of the logic to decrement the
reference count and free the barrier & memory.
What do you think of the changes now?
lib/librte_eal/common/eal_common_thread.c | 51 +++++++++++------------
1 file changed, 24 insertions(+), 27 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..2a8811582 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,11 +170,18 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ unsigned int refcnt;
};
+static void ctrl_params_free(struct rte_thread_ctrl_params* params) {
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+ }
+}
+
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
@@ -184,11 +191,8 @@ static void *ctrl_thread_init(void *arg)
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
return start_routine(routine_arg);
}
@@ -210,14 +214,15 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
+ params->refcnt = 2;
- pthread_barrier_init(¶ms->configured, NULL, 2);
+ ret = pthread_barrier_init(¶ms->configured, NULL, 2);
+ if (ret != 0)
+ goto fail_no_barrier;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail_with_barrier;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -227,25 +232,17 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
- if (ret)
- goto fail;
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ return -ret;
- return 0;
+fail_with_barrier:
+ pthread_barrier_destroy(¶ms->configured);
+
+fail_no_barrier:
+ free(params);
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
- pthread_cancel(*thread);
- pthread_join(*thread, NULL);
return -ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH v5] eal: fix race in ctrl thread creation
2021-04-07 12:35 ` [dpdk-stable] [PATCH v4] " Luc Pelletier
@ 2021-04-07 12:53 ` Luc Pelletier
2021-04-07 13:22 ` Luc Pelletier
2021-04-07 13:31 ` Olivier Matz
0 siblings, 2 replies; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 12:53 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Same as v4 except that I fixed 2 minor style issues flagged by patchwork.
lib/librte_eal/common/eal_common_thread.c | 52 +++++++++++------------
1 file changed, 25 insertions(+), 27 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..c1044e795 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,11 +170,19 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ unsigned int refcnt;
};
+static void ctrl_params_free(struct rte_thread_ctrl_params *params)
+{
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+ }
+}
+
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
@@ -184,11 +192,8 @@ static void *ctrl_thread_init(void *arg)
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
return start_routine(routine_arg);
}
@@ -210,14 +215,15 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
+ params->refcnt = 2;
- pthread_barrier_init(¶ms->configured, NULL, 2);
+ ret = pthread_barrier_init(¶ms->configured, NULL, 2);
+ if (ret != 0)
+ goto fail_no_barrier;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail_with_barrier;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -227,25 +233,17 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
- if (ret)
- goto fail;
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ return -ret;
- return 0;
+fail_with_barrier:
+ pthread_barrier_destroy(¶ms->configured);
+
+fail_no_barrier:
+ free(params);
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
- pthread_cancel(*thread);
- pthread_join(*thread, NULL);
return -ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v5] eal: fix race in ctrl thread creation
2021-04-07 12:53 ` [dpdk-stable] [PATCH v5] " Luc Pelletier
@ 2021-04-07 13:22 ` Luc Pelletier
2021-04-07 13:31 ` Olivier Matz
1 sibling, 0 replies; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 13:22 UTC (permalink / raw)
To: Olivier Matz, jianfeng.tan, Honnappa.Nagarahalli; +Cc: dev, stable
Not directly related to this patch, but can someone please explain why
Patchwork is creating a new series everytime I post a new version of
the patch to this thread? I must be doing something wrong but I don't
know what it is. I have been using --in-reply-to with git send-email
but that's apparently not enough. Maybe I'm missing something but I
see several items in Patchwork for this single thread when, IIUC, it
should only be one.
Thank you.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v5] eal: fix race in ctrl thread creation
2021-04-07 12:53 ` [dpdk-stable] [PATCH v5] " Luc Pelletier
2021-04-07 13:22 ` Luc Pelletier
@ 2021-04-07 13:31 ` Olivier Matz
2021-04-07 14:42 ` [dpdk-stable] [PATCH v6] " Luc Pelletier
2021-04-07 15:15 ` [dpdk-stable] [PATCH v5] " Honnappa Nagarahalli
1 sibling, 2 replies; 24+ messages in thread
From: Olivier Matz @ 2021-04-07 13:31 UTC (permalink / raw)
To: Luc Pelletier; +Cc: jianfeng.tan, Honnappa.Nagarahalli, dev, stable
Hi Luc,
On Wed, Apr 07, 2021 at 08:53:23AM -0400, Luc Pelletier wrote:
> The creation of control threads uses a pthread barrier for
> synchronization. This patch fixes a race condition where the pthread
> barrier could get destroyed while one of the threads has not yet
> returned from the pthread_barrier_wait function, which could result in
> undefined behaviour.
>
> Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
> ---
>
> Same as v4 except that I fixed 2 minor style issues flagged by patchwork.
>
> lib/librte_eal/common/eal_common_thread.c | 52 +++++++++++------------
> 1 file changed, 25 insertions(+), 27 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902..c1044e795 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -170,11 +170,19 @@ struct rte_thread_ctrl_params {
> void *(*start_routine)(void *);
> void *arg;
> pthread_barrier_t configured;
> + unsigned int refcnt;
> };
>
> +static void ctrl_params_free(struct rte_thread_ctrl_params *params)
> +{
> + if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
> + pthread_barrier_destroy(¶ms->configured);
> + free(params);
> + }
> +}
> +
> static void *ctrl_thread_init(void *arg)
> {
> - int ret;
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> @@ -184,11 +192,8 @@ static void *ctrl_thread_init(void *arg)
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> + pthread_barrier_wait(¶ms->configured);
> + ctrl_params_free(params);
>
> return start_routine(routine_arg);
> }
> @@ -210,14 +215,15 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
>
> params->start_routine = start_routine;
> params->arg = arg;
> + params->refcnt = 2;
>
> - pthread_barrier_init(¶ms->configured, NULL, 2);
> + ret = pthread_barrier_init(¶ms->configured, NULL, 2);
> + if (ret != 0)
> + goto fail_no_barrier;
>
> ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> - if (ret != 0) {
> - free(params);
> - return -ret;
> - }
> + if (ret != 0)
> + goto fail_with_barrier;
>
> if (name != NULL) {
> ret = rte_thread_setname(*thread, name);
> @@ -227,25 +233,17 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> }
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> - if (ret)
> - goto fail;
> + pthread_barrier_wait(¶ms->configured);
> + ctrl_params_free(params);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> + return -ret;
I think not killing the thread when pthread_setaffinity_np() returns an
error is not very understandable from the API user point of view.
What about doing this on top of your patch? The idea is to set
start_routine to NULL before the barrier if pthread_setaffinity_np()
failed. So there is no need to cancel the thread, it will exit by
itself.
@@ -187,14 +187,18 @@ static void *ctrl_thread_init(void *arg)
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
struct rte_thread_ctrl_params *params = arg;
- void *(*start_routine)(void *) = params->start_routine;
+ void *(*start_routine)(void *);
void *routine_arg = params->arg;
__rte_thread_init(rte_lcore_id(), cpuset);
pthread_barrier_wait(¶ms->configured);
+ start_routine = params->start_routine;
ctrl_params_free(params);
+ if (start_routine == NULL)
+ return NULL;
+
return start_routine(routine_arg);
}
@@ -233,10 +237,18 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
+ if (ret != 0)
+ params->start_routine = NULL;
+
pthread_barrier_wait(¶ms->configured);
ctrl_params_free(params);
- return -ret;
+ if (ret != 0) {
+ pthread_join(*thread, NULL);
+ return -ret;
+ }
+
+ return 0;
fail_with_barrier:
pthread_barrier_destroy(¶ms->configured);
Regards,
Olivier
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH v6] eal: fix race in ctrl thread creation
2021-04-07 13:31 ` Olivier Matz
@ 2021-04-07 14:42 ` Luc Pelletier
2021-04-07 14:57 ` Olivier Matz
2021-04-07 15:15 ` [dpdk-stable] [PATCH v5] " Honnappa Nagarahalli
1 sibling, 1 reply; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 14:42 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
I've made the changes as you requested. However, I'm using the atomic
built-ins for reading and writing start_routine; I think they're
required to prevent any re-reordering.
Please let me know what you think.
lib/librte_eal/common/eal_common_thread.c | 63 +++++++++++++----------
1 file changed, 35 insertions(+), 28 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..fcb386f77 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,25 +170,34 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ unsigned int refcnt;
};
+static void ctrl_params_free(struct rte_thread_ctrl_params *params)
+{
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+ }
+}
+
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
struct rte_thread_ctrl_params *params = arg;
- void *(*start_routine)(void *) = params->start_routine;
+ void *(*start_routine)(void *);
void *routine_arg = params->arg;
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ pthread_barrier_wait(¶ms->configured);
+ start_routine = __atomic_load_n(¶ms->start_routine, __ATOMIC_ACQUIRE);
+ ctrl_params_free(params);
+
+ if (start_routine == NULL)
+ return NULL;
return start_routine(routine_arg);
}
@@ -210,14 +219,15 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
+ params->refcnt = 2;
- pthread_barrier_init(¶ms->configured, NULL, 2);
+ ret = pthread_barrier_init(¶ms->configured, NULL, 2);
+ if (ret != 0)
+ goto fail_no_barrier;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail_with_barrier;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -227,25 +237,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
- if (ret)
- goto fail;
+ if (ret != 0)
+ __atomic_store_n(¶ms->start_routine, NULL, __ATOMIC_RELEASE);
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ if (ret != 0)
+ pthread_join(*thread, NULL);
- return 0;
+ return -ret;
+
+fail_with_barrier:
+ pthread_barrier_destroy(¶ms->configured);
+
+fail_no_barrier:
+ free(params);
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
- pthread_cancel(*thread);
- pthread_join(*thread, NULL);
return -ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v6] eal: fix race in ctrl thread creation
2021-04-07 14:42 ` [dpdk-stable] [PATCH v6] " Luc Pelletier
@ 2021-04-07 14:57 ` Olivier Matz
2021-04-07 15:29 ` [dpdk-stable] [PATCH v7] " Luc Pelletier
0 siblings, 1 reply; 24+ messages in thread
From: Olivier Matz @ 2021-04-07 14:57 UTC (permalink / raw)
To: Luc Pelletier; +Cc: jianfeng.tan, Honnappa.Nagarahalli, dev, stable
Hi Luc,
On Wed, Apr 07, 2021 at 10:42:37AM -0400, Luc Pelletier wrote:
> The creation of control threads uses a pthread barrier for
> synchronization. This patch fixes a race condition where the pthread
> barrier could get destroyed while one of the threads has not yet
> returned from the pthread_barrier_wait function, which could result in
> undefined behaviour.
>
> Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
> ---
>
> Hi Olivier,
>
> I've made the changes as you requested. However, I'm using the atomic
> built-ins for reading and writing start_routine; I think they're
> required to prevent any re-reordering.
>
> Please let me know what you think.
From [1], it seems that pthread_barrier_wait() is a full memory barrier.
So while not wrong, I think using atomic built-ins it is not needed.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
>
> lib/librte_eal/common/eal_common_thread.c | 63 +++++++++++++----------
> 1 file changed, 35 insertions(+), 28 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902..fcb386f77 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -170,25 +170,34 @@ struct rte_thread_ctrl_params {
> void *(*start_routine)(void *);
> void *arg;
> pthread_barrier_t configured;
> + unsigned int refcnt;
> };
>
> +static void ctrl_params_free(struct rte_thread_ctrl_params *params)
> +{
> + if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
> + pthread_barrier_destroy(¶ms->configured);
> + free(params);
> + }
> +}
> +
> static void *ctrl_thread_init(void *arg)
> {
> - int ret;
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> struct rte_thread_ctrl_params *params = arg;
> - void *(*start_routine)(void *) = params->start_routine;
> + void *(*start_routine)(void *);
> void *routine_arg = params->arg;
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> + pthread_barrier_wait(¶ms->configured);
> + start_routine = __atomic_load_n(¶ms->start_routine, __ATOMIC_ACQUIRE);
> + ctrl_params_free(params);
> +
> + if (start_routine == NULL)
> + return NULL;
>
> return start_routine(routine_arg);
> }
> @@ -210,14 +219,15 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
>
> params->start_routine = start_routine;
> params->arg = arg;
> + params->refcnt = 2;
>
> - pthread_barrier_init(¶ms->configured, NULL, 2);
> + ret = pthread_barrier_init(¶ms->configured, NULL, 2);
> + if (ret != 0)
> + goto fail_no_barrier;
>
> ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> - if (ret != 0) {
> - free(params);
> - return -ret;
> - }
> + if (ret != 0)
> + goto fail_with_barrier;
>
> if (name != NULL) {
> ret = rte_thread_setname(*thread, name);
> @@ -227,25 +237,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> }
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> - if (ret)
> - goto fail;
> + if (ret != 0)
> + __atomic_store_n(¶ms->start_routine, NULL, __ATOMIC_RELEASE);
> + pthread_barrier_wait(¶ms->configured);
> + ctrl_params_free(params);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> + if (ret != 0)
> + pthread_join(*thread, NULL);
>
> - return 0;
> + return -ret;
> +
> +fail_with_barrier:
> + pthread_barrier_destroy(¶ms->configured);
> +
> +fail_no_barrier:
> + free(params);
>
> -fail:
> - if (PTHREAD_BARRIER_SERIAL_THREAD ==
> - pthread_barrier_wait(¶ms->configured)) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> - pthread_cancel(*thread);
> - pthread_join(*thread, NULL);
> return -ret;
> }
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v5] eal: fix race in ctrl thread creation
2021-04-07 13:31 ` Olivier Matz
2021-04-07 14:42 ` [dpdk-stable] [PATCH v6] " Luc Pelletier
@ 2021-04-07 15:15 ` Honnappa Nagarahalli
2021-04-07 20:16 ` [dpdk-stable] [PATCH 1/2] " Luc Pelletier
2021-04-07 20:16 ` [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic Luc Pelletier
1 sibling, 2 replies; 24+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-07 15:15 UTC (permalink / raw)
To: Olivier Matz, Luc Pelletier
Cc: jianfeng.tan, dev, stable, nd, Honnappa Nagarahalli, nd
<snip>
>
> Hi Luc,
>
> On Wed, Apr 07, 2021 at 08:53:23AM -0400, Luc Pelletier wrote:
> > The creation of control threads uses a pthread barrier for
> > synchronization. This patch fixes a race condition where the pthread
> > barrier could get destroyed while one of the threads has not yet
> > returned from the pthread_barrier_wait function, which could result in
> > undefined behaviour.
> >
> > Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread
> > creation")
> > Cc: jianfeng.tan@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
> > ---
> >
> > Same as v4 except that I fixed 2 minor style issues flagged by patchwork.
> >
> > lib/librte_eal/common/eal_common_thread.c | 52
> > +++++++++++------------
> > 1 file changed, 25 insertions(+), 27 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_thread.c
> > b/lib/librte_eal/common/eal_common_thread.c
> > index 73a055902..c1044e795 100644
> > --- a/lib/librte_eal/common/eal_common_thread.c
> > +++ b/lib/librte_eal/common/eal_common_thread.c
> > @@ -170,11 +170,19 @@ struct rte_thread_ctrl_params {
> > void *(*start_routine)(void *);
> > void *arg;
> > pthread_barrier_t configured;
> > + unsigned int refcnt;
> > };
> >
> > +static void ctrl_params_free(struct rte_thread_ctrl_params *params) {
> > + if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) ==
> 0) {
> > + pthread_barrier_destroy(¶ms->configured);
> > + free(params);
> > + }
> > +}
> > +
> > static void *ctrl_thread_init(void *arg) {
> > - int ret;
> > struct internal_config *internal_conf =
> > eal_get_internal_configuration();
> > rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset; @@ -184,11
> > +192,8 @@ static void *ctrl_thread_init(void *arg)
> >
> > __rte_thread_init(rte_lcore_id(), cpuset);
> >
> > - ret = pthread_barrier_wait(¶ms->configured);
> > - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> > - pthread_barrier_destroy(¶ms->configured);
> > - free(params);
> > - }
> > + pthread_barrier_wait(¶ms->configured);
> > + ctrl_params_free(params);
> >
> > return start_routine(routine_arg);
> > }
> > @@ -210,14 +215,15 @@ rte_ctrl_thread_create(pthread_t *thread, const
> > char *name,
> >
> > params->start_routine = start_routine;
> > params->arg = arg;
> > + params->refcnt = 2;
> >
> > - pthread_barrier_init(¶ms->configured, NULL, 2);
> > + ret = pthread_barrier_init(¶ms->configured, NULL, 2);
> > + if (ret != 0)
> > + goto fail_no_barrier;
> >
> > ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> > - if (ret != 0) {
> > - free(params);
> > - return -ret;
> > - }
> > + if (ret != 0)
> > + goto fail_with_barrier;
> >
> > if (name != NULL) {
> > ret = rte_thread_setname(*thread, name); @@ -227,25
> +233,17 @@
> > rte_ctrl_thread_create(pthread_t *thread, const char *name,
> > }
> >
> > ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> > - if (ret)
> > - goto fail;
> > + pthread_barrier_wait(¶ms->configured);
> > + ctrl_params_free(params);
> >
> > - ret = pthread_barrier_wait(¶ms->configured);
> > - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> > - pthread_barrier_destroy(¶ms->configured);
> > - free(params);
> > - }
> > + return -ret;
>
> I think not killing the thread when pthread_setaffinity_np() returns an error is
> not very understandable from the API user point of view.
Agree.
>
> What about doing this on top of your patch? The idea is to set start_routine
> to NULL before the barrier if pthread_setaffinity_np() failed. So there is no
> need to cancel the thread, it will exit by itself.
How about using the pthread_attr_setaffinity_np API?
It is deviating from the documentation of the 'rte_ctrl_thread_create'. But, from the user perspective, the behavior should not change.
This way we do not have to handle the error after the thread is launched.
>
> @@ -187,14 +187,18 @@ static void *ctrl_thread_init(void *arg)
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> struct rte_thread_ctrl_params *params = arg;
> - void *(*start_routine)(void *) = params->start_routine;
> + void *(*start_routine)(void *);
> void *routine_arg = params->arg;
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> pthread_barrier_wait(¶ms->configured);
> + start_routine = params->start_routine;
> ctrl_params_free(params);
>
> + if (start_routine == NULL)
> + return NULL;
> +
> return start_routine(routine_arg);
> }
>
> @@ -233,10 +237,18 @@ rte_ctrl_thread_create(pthread_t *thread, const
> char *name,
> }
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> + if (ret != 0)
> + params->start_routine = NULL;
> +
> pthread_barrier_wait(¶ms->configured);
> ctrl_params_free(params);
>
> - return -ret;
> + if (ret != 0) {
> + pthread_join(*thread, NULL);
> + return -ret;
> + }
> +
> + return 0;
>
> fail_with_barrier:
> pthread_barrier_destroy(¶ms->configured);
>
>
> Regards,
> Olivier
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH v7] eal: fix race in ctrl thread creation
2021-04-07 14:57 ` Olivier Matz
@ 2021-04-07 15:29 ` Luc Pelletier
2021-04-07 17:15 ` Honnappa Nagarahalli
0 siblings, 1 reply; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 15:29 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.
Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
Olivier, thanks for pointing out that pthread_barrier_wait acts as a
full memory barrier; I didn't know that was explicitly specified in the
documentation. I've changed it to use a regular read/write.
Hi Honnappa,
I have not changed the code to use pthread_attr_setaffinity_np because
the attr parameter is const. I could make a copy of the provided
attributes and use that instead, but I think this would still violate
the spirit of the API and, AFAICT, there's no official mechanism for
copying a pthread_attr_t.
Please let me know what you think.
lib/librte_eal/common/eal_common_thread.c | 63 +++++++++++++----------
1 file changed, 35 insertions(+), 28 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..d4e09f84b 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,25 +170,34 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ unsigned int refcnt;
};
+static void ctrl_params_free(struct rte_thread_ctrl_params *params)
+{
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+ }
+}
+
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
struct rte_thread_ctrl_params *params = arg;
- void *(*start_routine)(void *) = params->start_routine;
+ void *(*start_routine)(void *);
void *routine_arg = params->arg;
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ pthread_barrier_wait(¶ms->configured);
+ start_routine = params->start_routine;
+ ctrl_params_free(params);
+
+ if (start_routine == NULL)
+ return NULL;
return start_routine(routine_arg);
}
@@ -210,14 +219,15 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
+ params->refcnt = 2;
- pthread_barrier_init(¶ms->configured, NULL, 2);
+ ret = pthread_barrier_init(¶ms->configured, NULL, 2);
+ if (ret != 0)
+ goto fail_no_barrier;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail_with_barrier;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -227,25 +237,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
- if (ret)
- goto fail;
+ if (ret != 0)
+ params->start_routine = NULL;
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ if (ret != 0)
+ pthread_join(*thread, NULL);
- return 0;
+ return -ret;
+
+fail_with_barrier:
+ pthread_barrier_destroy(¶ms->configured);
+
+fail_no_barrier:
+ free(params);
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
- pthread_cancel(*thread);
- pthread_join(*thread, NULL);
return -ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH v7] eal: fix race in ctrl thread creation
2021-04-07 15:29 ` [dpdk-stable] [PATCH v7] " Luc Pelletier
@ 2021-04-07 17:15 ` Honnappa Nagarahalli
0 siblings, 0 replies; 24+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-07 17:15 UTC (permalink / raw)
To: Luc Pelletier, olivier.matz, jianfeng.tan
Cc: dev, stable, nd, Honnappa Nagarahalli, nd
<snip>
> Subject: [PATCH v7] eal: fix race in ctrl thread creation
>
> The creation of control threads uses a pthread barrier for synchronization.
> This patch fixes a race condition where the pthread barrier could get
> destroyed while one of the threads has not yet returned from the
> pthread_barrier_wait function, which could result in undefined behaviour.
>
> Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Overall, this LGTM.
Just wondering if this should be split into 2 commits.
1) Fix the race condition with refcnt.
2) Remove the use of pthread_cancel
I have some observations below.
> ---
>
> Hi Olivier,
>
> Olivier, thanks for pointing out that pthread_barrier_wait acts as a full
> memory barrier; I didn't know that was explicitly specified in the
"full memory barrier" just says that the memory operations prior and after the barrier will not be mixed. It does not guarantee that the modification done by one core are 'visible' to other cores. Additional code is required to ensure that the modifications are visible. For ex: in the following code setting start_routine=NULL is not guaranteed to be visible to other cores if pthread_barrier_wait is just a full barrier.
The particular wording used in the reference is "synchronize memory with respect to other threads". I am not exactly sure what is means, but may be safe to assume it means the updates are 'visible' to other cores (as the locking APIs are part of the list and they guarantee the visibility aspects).
May be we can simplify this during 21.11 release as we will have the ability to make more invasive changes.
> documentation. I've changed it to use a regular read/write.
>
> Hi Honnappa,
>
> I have not changed the code to use pthread_attr_setaffinity_np because the
> attr parameter is const. I could make a copy of the provided attributes and
> use that instead, but I think this would still violate the spirit of the API and,
> AFAICT, there's no official mechanism for copying a pthread_attr_t.
Ok, thank you.
>
> Please let me know what you think.
>
> lib/librte_eal/common/eal_common_thread.c | 63 +++++++++++++----------
> 1 file changed, 35 insertions(+), 28 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_thread.c
> b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902..d4e09f84b 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -170,25 +170,34 @@ struct rte_thread_ctrl_params {
> void *(*start_routine)(void *);
> void *arg;
> pthread_barrier_t configured;
> + unsigned int refcnt;
> };
>
> +static void ctrl_params_free(struct rte_thread_ctrl_params *params) {
> + if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) ==
> 0) {
> + pthread_barrier_destroy(¶ms->configured);
> + free(params);
> + }
> +}
> +
> static void *ctrl_thread_init(void *arg) {
> - int ret;
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> struct rte_thread_ctrl_params *params = arg;
> - void *(*start_routine)(void *) = params->start_routine;
> + void *(*start_routine)(void *);
> void *routine_arg = params->arg;
>
> __rte_thread_init(rte_lcore_id(), cpuset);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> + pthread_barrier_wait(¶ms->configured);
> + start_routine = params->start_routine;
> + ctrl_params_free(params);
> +
> + if (start_routine == NULL)
> + return NULL;
>
> return start_routine(routine_arg);
> }
> @@ -210,14 +219,15 @@ rte_ctrl_thread_create(pthread_t *thread, const
> char *name,
>
> params->start_routine = start_routine;
> params->arg = arg;
> + params->refcnt = 2;
>
> - pthread_barrier_init(¶ms->configured, NULL, 2);
> + ret = pthread_barrier_init(¶ms->configured, NULL, 2);
> + if (ret != 0)
> + goto fail_no_barrier;
>
> ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
> - if (ret != 0) {
> - free(params);
> - return -ret;
> - }
> + if (ret != 0)
> + goto fail_with_barrier;
>
> if (name != NULL) {
> ret = rte_thread_setname(*thread, name); @@ -227,25
> +237,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> }
>
> ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
> - if (ret)
> - goto fail;
> + if (ret != 0)
> + params->start_routine = NULL;
> + pthread_barrier_wait(¶ms->configured);
> + ctrl_params_free(params);
>
> - ret = pthread_barrier_wait(¶ms->configured);
> - if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> + if (ret != 0)
A comment here on why the pthread_join will not wait indefinitely will help. Splitting the patch will also help understand the changes better.
> + pthread_join(*thread, NULL);
>
> - return 0;
> + return -ret;
> +
> +fail_with_barrier:
> + pthread_barrier_destroy(¶ms->configured);
> +
> +fail_no_barrier:
> + free(params);
>
> -fail:
> - if (PTHREAD_BARRIER_SERIAL_THREAD ==
> - pthread_barrier_wait(¶ms->configured)) {
> - pthread_barrier_destroy(¶ms->configured);
> - free(params);
> - }
> - pthread_cancel(*thread);
> - pthread_join(*thread, NULL);
> return -ret;
> }
>
> --
> 2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH 1/2] eal: fix race in ctrl thread creation
2021-04-07 15:15 ` [dpdk-stable] [PATCH v5] " Honnappa Nagarahalli
@ 2021-04-07 20:16 ` Luc Pelletier
2021-04-08 14:17 ` Olivier Matz
2021-04-08 17:06 ` Honnappa Nagarahalli
2021-04-07 20:16 ` [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic Luc Pelletier
1 sibling, 2 replies; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 20:16 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.
Fixes: 3a0d465 ("eal: fix use-after-free on control thread creation")
Cc: jianfeng.tan@intel.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
Hi Honnappa,
As discussed, I've split the changes into 2 patches. This first commit
fixes the race condition but leaves in the pthread_cancel call.
lib/librte_eal/common/eal_common_thread.c | 49 +++++++++++++----------
1 file changed, 27 insertions(+), 22 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 73a055902..3347e91bf 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -170,11 +170,19 @@ struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
pthread_barrier_t configured;
+ unsigned int refcnt;
};
+static void ctrl_params_free(struct rte_thread_ctrl_params *params)
+{
+ if (__atomic_sub_fetch(¶ms->refcnt, 1, __ATOMIC_ACQ_REL) == 0) {
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+ }
+}
+
static void *ctrl_thread_init(void *arg)
{
- int ret;
struct internal_config *internal_conf =
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
@@ -184,11 +192,8 @@ static void *ctrl_thread_init(void *arg)
__rte_thread_init(rte_lcore_id(), cpuset);
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
return start_routine(routine_arg);
}
@@ -210,15 +215,18 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->start_routine = start_routine;
params->arg = arg;
+ params->refcnt = 2;
- pthread_barrier_init(¶ms->configured, NULL, 2);
-
- ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
+ ret = pthread_barrier_init(¶ms->configured, NULL, 2);
if (ret != 0) {
free(params);
return -ret;
}
+ ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
+ if (ret != 0)
+ goto fail;
+
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
if (ret < 0)
@@ -227,25 +235,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
}
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
- if (ret)
- goto fail;
+ if (ret != 0)
+ goto fail_cancel;
- ret = pthread_barrier_wait(¶ms->configured);
- if (ret == PTHREAD_BARRIER_SERIAL_THREAD) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+ pthread_barrier_wait(¶ms->configured);
+ ctrl_params_free(params);
return 0;
-fail:
- if (PTHREAD_BARRIER_SERIAL_THREAD ==
- pthread_barrier_wait(¶ms->configured)) {
- pthread_barrier_destroy(¶ms->configured);
- free(params);
- }
+fail_cancel:
pthread_cancel(*thread);
pthread_join(*thread, NULL);
+
+fail:
+ pthread_barrier_destroy(¶ms->configured);
+ free(params);
+
return -ret;
}
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic
2021-04-07 15:15 ` [dpdk-stable] [PATCH v5] " Honnappa Nagarahalli
2021-04-07 20:16 ` [dpdk-stable] [PATCH 1/2] " Luc Pelletier
@ 2021-04-07 20:16 ` Luc Pelletier
2021-04-08 14:20 ` Olivier Matz
` (2 more replies)
1 sibling, 3 replies; 24+ messages in thread
From: Luc Pelletier @ 2021-04-07 20:16 UTC (permalink / raw)
To: olivier.matz, jianfeng.tan, Honnappa.Nagarahalli
Cc: dev, Luc Pelletier, stable
The affinity of a control thread is set after it has been launched. If
setting the affinity fails, pthread_cancel is called followed by a call
to pthread_join, which can hang forever if the thread's start routine
doesn't call a pthread cancellation point.
This patch modifies the logic so that the control thread exits
gracefully if the affinity cannot be set successfully and removes the
call to pthread_cancel.
Fixes: 6383d26 ("eal: set name when creating a control thread")
Cc: olivier.matz@6wind.com
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
Hi Olivier,
Hi Honnappa,
As discussed, I've split the changes into 2 patches. This second commit
removes the pthread_cancel call which could result in a hang on join, if
the ctrl thread routine didn't call a cancellation point.
lib/librte_eal/common/eal_common_thread.c | 29 +++++++++++++----------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 3347e91bf..03dbcd9e8 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -187,14 +187,18 @@ static void *ctrl_thread_init(void *arg)
eal_get_internal_configuration();
rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
struct rte_thread_ctrl_params *params = arg;
- void *(*start_routine)(void *) = params->start_routine;
+ void *(*start_routine)(void *);
void *routine_arg = params->arg;
__rte_thread_init(rte_lcore_id(), cpuset);
pthread_barrier_wait(¶ms->configured);
+ start_routine = params->start_routine;
ctrl_params_free(params);
+ if (start_routine == NULL)
+ return NULL;
+
return start_routine(routine_arg);
}
@@ -218,14 +222,12 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
params->refcnt = 2;
ret = pthread_barrier_init(¶ms->configured, NULL, 2);
- if (ret != 0) {
- free(params);
- return -ret;
- }
+ if (ret != 0)
+ goto fail_no_barrier;
ret = pthread_create(thread, attr, ctrl_thread_init, (void *)params);
if (ret != 0)
- goto fail;
+ goto fail_with_barrier;
if (name != NULL) {
ret = rte_thread_setname(*thread, name);
@@ -236,19 +238,22 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
ret = pthread_setaffinity_np(*thread, sizeof(*cpuset), cpuset);
if (ret != 0)
- goto fail_cancel;
+ params->start_routine = NULL;
pthread_barrier_wait(¶ms->configured);
ctrl_params_free(params);
- return 0;
+ if (ret != 0)
+ /* start_routine has been set to NULL above; */
+ /* ctrl thread will exit immediately */
+ pthread_join(*thread, NULL);
-fail_cancel:
- pthread_cancel(*thread);
- pthread_join(*thread, NULL);
+ return -ret;
-fail:
+fail_with_barrier:
pthread_barrier_destroy(¶ms->configured);
+
+fail_no_barrier:
free(params);
return -ret;
--
2.25.1
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH 1/2] eal: fix race in ctrl thread creation
2021-04-07 20:16 ` [dpdk-stable] [PATCH 1/2] " Luc Pelletier
@ 2021-04-08 14:17 ` Olivier Matz
2021-04-08 17:06 ` Honnappa Nagarahalli
1 sibling, 0 replies; 24+ messages in thread
From: Olivier Matz @ 2021-04-08 14:17 UTC (permalink / raw)
To: Luc Pelletier; +Cc: jianfeng.tan, Honnappa.Nagarahalli, dev, stable
On Wed, Apr 07, 2021 at 04:16:04PM -0400, Luc Pelletier wrote:
> The creation of control threads uses a pthread barrier for
> synchronization. This patch fixes a race condition where the pthread
> barrier could get destroyed while one of the threads has not yet
> returned from the pthread_barrier_wait function, which could result in
> undefined behaviour.
>
> Fixes: 3a0d465 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic
2021-04-07 20:16 ` [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic Luc Pelletier
@ 2021-04-08 14:20 ` Olivier Matz
2021-04-08 18:01 ` Luc Pelletier
2021-04-08 17:07 ` [dpdk-stable] " Honnappa Nagarahalli
2021-04-09 14:34 ` David Marchand
2 siblings, 1 reply; 24+ messages in thread
From: Olivier Matz @ 2021-04-08 14:20 UTC (permalink / raw)
To: Luc Pelletier; +Cc: jianfeng.tan, Honnappa.Nagarahalli, dev, stable
Hi Luc,
On Wed, Apr 07, 2021 at 04:16:06PM -0400, Luc Pelletier wrote:
> The affinity of a control thread is set after it has been launched. If
> setting the affinity fails, pthread_cancel is called followed by a call
> to pthread_join, which can hang forever if the thread's start routine
> doesn't call a pthread cancellation point.
>
> This patch modifies the logic so that the control thread exits
> gracefully if the affinity cannot be set successfully and removes the
> call to pthread_cancel.
>
> Fixes: 6383d26 ("eal: set name when creating a control thread")
> Cc: olivier.matz@6wind.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Thank you for these 2 fixes. Note the the title of your patches do not
contain the version (should have been v8?). I don't know how critical
it is for commiters.
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH 1/2] eal: fix race in ctrl thread creation
2021-04-07 20:16 ` [dpdk-stable] [PATCH 1/2] " Luc Pelletier
2021-04-08 14:17 ` Olivier Matz
@ 2021-04-08 17:06 ` Honnappa Nagarahalli
1 sibling, 0 replies; 24+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-08 17:06 UTC (permalink / raw)
To: Luc Pelletier, olivier.matz, jianfeng.tan
Cc: dev, stable, nd, Honnappa Nagarahalli, nd
<snip>
>
> The creation of control threads uses a pthread barrier for synchronization.
> This patch fixes a race condition where the pthread barrier could get
> destroyed while one of the threads has not yet returned from the
> pthread_barrier_wait function, which could result in undefined behaviour.
>
> Fixes: 3a0d465 ("eal: fix use-after-free on control thread creation")
> Cc: jianfeng.tan@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Looks good.
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
<snip>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic
2021-04-07 20:16 ` [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic Luc Pelletier
2021-04-08 14:20 ` Olivier Matz
@ 2021-04-08 17:07 ` Honnappa Nagarahalli
2021-04-09 14:34 ` David Marchand
2 siblings, 0 replies; 24+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-08 17:07 UTC (permalink / raw)
To: Luc Pelletier, olivier.matz, jianfeng.tan
Cc: dev, stable, nd, Honnappa Nagarahalli, nd
<snip>
>
> The affinity of a control thread is set after it has been launched. If setting the
> affinity fails, pthread_cancel is called followed by a call to pthread_join, which
> can hang forever if the thread's start routine doesn't call a pthread
> cancellation point.
>
> This patch modifies the logic so that the control thread exits gracefully if the
> affinity cannot be set successfully and removes the call to pthread_cancel.
>
> Fixes: 6383d26 ("eal: set name when creating a control thread")
> Cc: olivier.matz@6wind.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Looks good.
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
>
<snip>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic
2021-04-08 14:20 ` Olivier Matz
@ 2021-04-08 18:01 ` Luc Pelletier
2021-04-09 8:13 ` [dpdk-stable] [dpdk-dev] " David Marchand
0 siblings, 1 reply; 24+ messages in thread
From: Luc Pelletier @ 2021-04-08 18:01 UTC (permalink / raw)
To: Olivier Matz; +Cc: jianfeng.tan, Honnappa.Nagarahalli, dev, stable
> Thank you for these 2 fixes. Note the the title of your patches do not
> contain the version (should have been v8?). I don't know how critical
> it is for commiters.
Thanks Olivier. I'll admit that I wasn't sure if I should version the
patches after splitting the original. I opted not to but it seems like
I should have. If it's a problem, please let me know and I'll repost
them with 'v8'.
Le jeu. 8 avr. 2021 à 10:20, Olivier Matz <olivier.matz@6wind.com> a écrit :
>
> Hi Luc,
>
> On Wed, Apr 07, 2021 at 04:16:06PM -0400, Luc Pelletier wrote:
> > The affinity of a control thread is set after it has been launched. If
> > setting the affinity fails, pthread_cancel is called followed by a call
> > to pthread_join, which can hang forever if the thread's start routine
> > doesn't call a pthread cancellation point.
> >
> > This patch modifies the logic so that the control thread exits
> > gracefully if the affinity cannot be set successfully and removes the
> > call to pthread_cancel.
> >
> > Fixes: 6383d26 ("eal: set name when creating a control thread")
> > Cc: olivier.matz@6wind.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
>
> Thank you for these 2 fixes. Note the the title of your patches do not
> contain the version (should have been v8?). I don't know how critical
> it is for commiters.
>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic
2021-04-08 18:01 ` Luc Pelletier
@ 2021-04-09 8:13 ` David Marchand
0 siblings, 0 replies; 24+ messages in thread
From: David Marchand @ 2021-04-09 8:13 UTC (permalink / raw)
To: Luc Pelletier
Cc: Olivier Matz, jianfeng.tan, Honnappa Nagarahalli, dev, dpdk stable
On Thu, Apr 8, 2021 at 8:02 PM Luc Pelletier <lucp.at.work@gmail.com> wrote:
>
> > Thank you for these 2 fixes. Note the the title of your patches do not
> > contain the version (should have been v8?). I don't know how critical
> > it is for commiters.
>
> Thanks Olivier. I'll admit that I wasn't sure if I should version the
> patches after splitting the original. I opted not to but it seems like
> I should have. If it's a problem, please let me know and I'll repost
> them with 'v8'.
I followed this series closely, so not an issue for me.
No need to resend.
I'll look at merging it today.
--
David Marchand
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic
2021-04-07 20:16 ` [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic Luc Pelletier
2021-04-08 14:20 ` Olivier Matz
2021-04-08 17:07 ` [dpdk-stable] " Honnappa Nagarahalli
@ 2021-04-09 14:34 ` David Marchand
2 siblings, 0 replies; 24+ messages in thread
From: David Marchand @ 2021-04-09 14:34 UTC (permalink / raw)
To: Luc Pelletier; +Cc: Olivier Matz, Honnappa Nagarahalli, dev, dpdk stable
On Wed, Apr 7, 2021 at 10:29 PM Luc Pelletier <lucp.at.work@gmail.com> wrote:
>
> The affinity of a control thread is set after it has been launched. If
> setting the affinity fails, pthread_cancel is called followed by a call
> to pthread_join, which can hang forever if the thread's start routine
> doesn't call a pthread cancellation point.
>
> This patch modifies the logic so that the control thread exits
> gracefully if the affinity cannot be set successfully and removes the
> call to pthread_cancel.
>
> Fixes: 6383d26 ("eal: set name when creating a control thread")
Fixed sha1's while applying.
We prefer sha1 on 12 chars, like described in
https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-body.
> Cc: stable@dpdk.org
>
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Series applied, thanks for the fixes.
--
David Marchand
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2021-04-09 14:34 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-24 13:04 [dpdk-stable] [PATCH] eal: fix possible UB on creation of ctrl thread Luc Pelletier
2021-03-25 11:27 ` [dpdk-stable] [PATCH v2] eal: fix race in ctrl thread creation Olivier Matz
2021-03-25 14:42 ` Luc Pelletier
2021-04-02 4:34 ` [dpdk-stable] [dpdk-dev] " Honnappa Nagarahalli
2021-04-06 16:15 ` [dpdk-stable] [PATCH v3] " Luc Pelletier
2021-04-06 21:10 ` Honnappa Nagarahalli
2021-04-07 12:35 ` [dpdk-stable] [PATCH v4] " Luc Pelletier
2021-04-07 12:53 ` [dpdk-stable] [PATCH v5] " Luc Pelletier
2021-04-07 13:22 ` Luc Pelletier
2021-04-07 13:31 ` Olivier Matz
2021-04-07 14:42 ` [dpdk-stable] [PATCH v6] " Luc Pelletier
2021-04-07 14:57 ` Olivier Matz
2021-04-07 15:29 ` [dpdk-stable] [PATCH v7] " Luc Pelletier
2021-04-07 17:15 ` Honnappa Nagarahalli
2021-04-07 15:15 ` [dpdk-stable] [PATCH v5] " Honnappa Nagarahalli
2021-04-07 20:16 ` [dpdk-stable] [PATCH 1/2] " Luc Pelletier
2021-04-08 14:17 ` Olivier Matz
2021-04-08 17:06 ` Honnappa Nagarahalli
2021-04-07 20:16 ` [dpdk-stable] [PATCH 2/2] eal: fix hang in ctrl thread creation error logic Luc Pelletier
2021-04-08 14:20 ` Olivier Matz
2021-04-08 18:01 ` Luc Pelletier
2021-04-09 8:13 ` [dpdk-stable] [dpdk-dev] " David Marchand
2021-04-08 17:07 ` [dpdk-stable] " Honnappa Nagarahalli
2021-04-09 14:34 ` David Marchand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).