From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 3E68841EBD;
	Fri, 17 Mar 2023 15:49:33 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 1AF0F42F98;
	Fri, 17 Mar 2023 15:49:33 +0100 (CET)
Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182])
 by mails.dpdk.org (Postfix) with ESMTP id 6026840395;
 Fri, 17 Mar 2023 15:49:32 +0100 (CET)
Received: by linux.microsoft.com (Postfix, from userid 1086)
 id AABB420C343B; Fri, 17 Mar 2023 07:49:31 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com AABB420C343B
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com;
 s=default; t=1679064571;
 bh=3fi5/T2vdpCu8lzfixq6DwvBVPIEtjBN7Zfs32bhD5Y=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To:From;
 b=cLCQ7xBntagktEzCb+4apUWHkdyrpODJQz6Zz1IB0QE/el66eJhtx+NKgtTV1i4DH
 JushXaUp+Ha5u1HNetui/JILuyMzGBq46TaMxbLqqJji/nO6fVt0zgoV87eBOw1JvP
 0IN03wI41HvHFTKYJWS+AsUM8NpR9kJ/ntU69KgU=
Date: Fri, 17 Mar 2023 07:49:31 -0700
From: Tyler Retzlaff <roretzla@linux.microsoft.com>
To: David Marchand <david.marchand@redhat.com>
Cc: dev@dpdk.org, thomas@monjalon.net, stephen@networkplumber.org,
 stable@dpdk.org
Subject: Re: [PATCH v5 2/2] eal: fix failure path race setting new thread
 affinity
Message-ID: <20230317144931.GA29683@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
References: <1677782682-27200-1-git-send-email-roretzla@linux.microsoft.com>
 <1678925224-2706-1-git-send-email-roretzla@linux.microsoft.com>
 <1678925224-2706-3-git-send-email-roretzla@linux.microsoft.com>
 <CAJFAV8yk-APRVYO-v953DUc=YKx=54mMCRPsCXUa_y43vs1q9A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <CAJFAV8yk-APRVYO-v953DUc=YKx=54mMCRPsCXUa_y43vs1q9A@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

On Fri, Mar 17, 2023 at 11:45:08AM +0100, David Marchand wrote:
> On Thu, Mar 16, 2023 at 1:07 AM Tyler Retzlaff
> <roretzla@linux.microsoft.com> wrote:
> >
> > In rte_thread_create setting affinity after pthread_create may fail.
> > Such a failure should result in the entire rte_thread_create failing
> > but doesn't.
> >
> > Additionally if there is a failure to set affinity a race exists where
> > the creating thread will free ctx and depending on scheduling of the new
> > thread it may also free ctx (double free).
> >
> > Resolve the above by setting the affinity from the newly created thread
> > using a condition variable to signal the completion of the thread
> > start wrapper having completed.
> >
> > Since we are now waiting for the thread start wrapper to complete we can
> > allocate the thread start wrapper context on the stack. While here clean
> > up the variable naming in the context to better highlight the fields of
> > the context require synchronization between the creating and created
> > thread.
> >
> > Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> > ---
> >  lib/eal/unix/rte_thread.c | 70 +++++++++++++++++++++++++++++------------------
> >  1 file changed, 43 insertions(+), 27 deletions(-)
> >
> > diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
> > index 37ebfcf..5992b04 100644
> > --- a/lib/eal/unix/rte_thread.c
> > +++ b/lib/eal/unix/rte_thread.c
> > @@ -16,9 +16,14 @@ struct eal_tls_key {
> >         pthread_key_t thread_index;
> >  };
> >
> > -struct thread_routine_ctx {
> > +struct thread_start_context {
> >         rte_thread_func thread_func;
> > -       void *routine_args;
> > +       void *thread_args;
> > +       const rte_thread_attr_t *thread_attr;
> > +       pthread_mutex_t wrapper_mutex;
> > +       pthread_cond_t wrapper_cond;
> > +       int wrapper_ret;
> > +       volatile int wrapper_done;
> 
> One question.
> 
> I see that wrapper_done is accessed under wrapper_mutex.
> Is volatile needed?

I'm not entirely certain. i'm being cautious since i can conceive of the
load in the loop being optimized as a single load by the compiler. but
again i'm not sure, i always like to learn if someone knows better.

> 
> (nit: a boolean is probably enough too)

I have no issue with it being a _Bool if you want to adjust it for that
i certainly don't object. ordinarily i would use _Bool but a lot of dpdk
code seems to prefer int so that's why i chose it. if we use the macro
bool then we should include stdbool.h directly into this translation
unit.

> 
> I was thinking of squashing below diff:

Yeah, no objection. you can decide if you want to keep the volatile or
not and add the stdbool.h include.

Thanks for reviewing, appreciate it.

> 
> diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
> index 5992b04a45..5ab5267ca3 100644
> --- a/lib/eal/unix/rte_thread.c
> +++ b/lib/eal/unix/rte_thread.c
> @@ -23,7 +23,7 @@ struct thread_start_context {
>         pthread_mutex_t wrapper_mutex;
>         pthread_cond_t wrapper_cond;
>         int wrapper_ret;
> -       volatile int wrapper_done;
> +       bool wrapper_done;
>  };
> 
>  static int
> @@ -101,7 +101,7 @@ thread_start_wrapper(void *arg)
> 
>         pthread_mutex_lock(&ctx->wrapper_mutex);
>         ctx->wrapper_ret = ret;
> -       ctx->wrapper_done = 1;
> +       ctx->wrapper_done = true;
>         pthread_cond_signal(&ctx->wrapper_cond);
>         pthread_mutex_unlock(&ctx->wrapper_mutex);
> 
> @@ -127,6 +127,7 @@ rte_thread_create(rte_thread_t *thread_id,
>                 .thread_func = thread_func,
>                 .thread_args = args,
>                 .thread_attr = thread_attr,
> +               .wrapper_done = false,
>                 .wrapper_mutex = PTHREAD_MUTEX_INITIALIZER,
>                 .wrapper_cond = PTHREAD_COND_INITIALIZER,
>         };
> @@ -151,7 +152,6 @@ rte_thread_create(rte_thread_t *thread_id,
>                         goto cleanup;
>                 }
> 
> -
>                 if (thread_attr->priority ==
>                                 RTE_THREAD_PRIORITY_REALTIME_CRITICAL) {
>                         ret = ENOTSUP;
> @@ -183,7 +183,7 @@ rte_thread_create(rte_thread_t *thread_id,
>         }
> 
>         pthread_mutex_lock(&ctx.wrapper_mutex);
> -       while (ctx.wrapper_done != 1)
> +       while (!ctx.wrapper_done)
>                 pthread_cond_wait(&ctx.wrapper_cond, &ctx.wrapper_mutex);
>         ret = ctx.wrapper_ret;
>         pthread_mutex_unlock(&ctx.wrapper_mutex);
> 
> 
> The rest lgtmn thanks Tyler.
> 
> 
> 
> -- 
> David Marchand