From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <1677782682-27200-1-git-send-email-roretzla@linux.microsoft.com>
 <3722941.kQq0lBPeGt@thomas>
 <20230309204935.GA32415@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
In-Reply-To: <20230309204935.GA32415@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
From: David Marchand
Date: Thu, 9 Mar 2023 22:05:39 +0100
Subject: Re: [PATCH 1/2] eal: fix failure race and behavior of thread create
To: Tyler Retzlaff
Cc: Thomas Monjalon, dev@dpdk.org, stable@dpdk.org
Content-Type: text/plain; charset="UTF-8"
List-Id: DPDK patches and discussions

On Thu, Mar 9, 2023 at 9:49 PM Tyler Retzlaff wrote:
>
> On Thu, Mar 09, 2023 at 10:58:06AM +0100, Thomas Monjalon wrote:
> > 09/03/2023 10:17, David Marchand:
> > > On Tue, Mar 7, 2023 at 3:33 PM David Marchand wrote:
> > > > On Thu, Mar 2, 2023 at 7:44 PM Tyler Retzlaff
> > > > wrote:
> > > > >
> > > > > In rte_thread_create setting affinity after pthread_create may fail.
> > > > > Such a failure should result in the entire rte_thread_create failing
> > > > > but doesn't.
> > > > >
> > > > > Additionally if there is a failure to set affinity a race exists where
> > > > > the creating thread will free ctx and depending on scheduling of the new
> > > > > thread it may also free ctx (double free).
> > > > >
> > > > > Resolve both of the above issues by using the pthread_setaffinity_np
> > > > > prior to thread creation to set the affinity of the created thread. By
> > > > > doing this no failure paths exist after pthread_create returns
> > > > > successfully.
> > > > >
> > > > > Fixes: ce6e911d20f6 ("eal: add thread lifetime API")
> > > > > Cc: stable@dpdk.org
> > > > > Cc: roretzla@linux.microsoft.com
> > > > >
> > > > > Signed-off-by: Tyler Retzlaff
> > > > Reviewed-by: David Marchand
> > >
> > > Series applied, thanks.
> >
> > Unfortunately we cannot merge this patch
> > because it does not compile on Alpine Linux (musl libc):
> >
> > lib/eal/unix/rte_thread.c:160:31: error:
> > implicit declaration of function 'pthread_attr_setaffinity_np'
>
> i didn't get any CI failure for this. did i just miss it?

Count on me, I would have complained if there was a CI issue ;-).

>
> >
> > Is it possible to fix the race without using pthread_attr_setaffinity_np?
> >
>
> it seems we never allowed threads to be created with a set affinity when
> using pthread_create directly (that was portable to alpine linux). for worker
> threads the start_routine is setting the affinity from the new thread.
>
> certainly we can make this work by doing the same thing, but we'll have
> to adjust the start routine wrapper to synchronize/wait for the new
> thread to set the affinity and if it fails terminate the new thread
> cleanly.
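
A rough sketch of the approach described above, for illustration only. It
is not the DPDK code and not a proposed patch: struct thread_ctx,
thread_wrapper(), create_pinned_thread() and mark_ran() are hypothetical
names, and the sketch assumes a libc that provides pthread_setaffinity_np()
(a non-portable GNU extension). The new thread sets its own affinity and
reports the result; the creator waits for that report and joins the thread
if it failed. Keeping ctx on the creator's stack is a choice made here so
that only one side ever owns it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical context shared between the creator and the new thread. */
struct thread_ctx {
	void *(*fn)(void *);     /* user start routine */
	void *arg;               /* user argument */
	const cpu_set_t *cpuset; /* requested affinity, NULL for none */
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int done;                /* set once the new thread has reported */
	int ret;                 /* result of setting the affinity */
};

static void *
thread_wrapper(void *p)
{
	struct thread_ctx *ctx = p;
	/* Copy what is needed later: ctx is off limits after the unlock. */
	void *(*fn)(void *) = ctx->fn;
	void *arg = ctx->arg;
	int ret = 0;

	if (ctx->cpuset != NULL)
		ret = pthread_setaffinity_np(pthread_self(),
				sizeof(*ctx->cpuset), ctx->cpuset);

	/* Report the result to the creator, which is waiting on cond. */
	pthread_mutex_lock(&ctx->lock);
	ctx->ret = ret;
	ctx->done = 1;
	pthread_cond_signal(&ctx->cond);
	pthread_mutex_unlock(&ctx->lock);

	/* On failure, exit without ever running the user routine. */
	return ret == 0 ? fn(arg) : NULL;
}

/* Returns 0 on success, or an errno-style value on failure. */
static int
create_pinned_thread(pthread_t *tid, const cpu_set_t *cpuset,
		void *(*fn)(void *), void *arg)
{
	struct thread_ctx ctx = { .fn = fn, .arg = arg, .cpuset = cpuset };
	int ret;

	assert(tid != NULL && fn != NULL);
	pthread_mutex_init(&ctx.lock, NULL);
	pthread_cond_init(&ctx.cond, NULL);

	ret = pthread_create(tid, NULL, thread_wrapper, &ctx);
	if (ret == 0) {
		/* Wait until the new thread has tried to set its affinity. */
		pthread_mutex_lock(&ctx.lock);
		while (!ctx.done)
			pthread_cond_wait(&ctx.cond, &ctx.lock);
		ret = ctx.ret;
		pthread_mutex_unlock(&ctx.lock);
		/* The thread gave up: reap it so the caller sees no thread. */
		if (ret != 0)
			pthread_join(*tid, NULL);
	}

	pthread_cond_destroy(&ctx.cond);
	pthread_mutex_destroy(&ctx.lock);
	return ret;
}

/* Example start routine: records that it ran. */
static void *
mark_ran(void *arg)
{
	*(int *)arg = 1;
	return NULL;
}
```

With this shape, a call such as create_pinned_thread(&tid, &set, mark_ran,
&flag) either returns 0 with a running thread that the caller later joins,
or returns an error with the thread already reaped. Nothing is freed on
either side, so the double free described in the commit message cannot
occur in this sketch.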
>
> i don't have a way to build for alpine linux or run the unit tests, does
> someone want to make the above suggested adjustment? or i can try and
> make a patch but someone else will have to carefully review and test.
>
> let me know how you'd like to proceed.

UNH is looking into re-enabling the Alpine job.

For the time being, if you have a github repository, I can propose a
quick patch using GHA:
https://github.com/david-marchand/dpdk/commit/ci

I had tested compilation with a previous version of this patch.
I just added running the unit tests (adding a checks: tests line in
the job matrix), let's see how it goes...
https://github.com/david-marchand/dpdk/actions/runs/4378715081/jobs/7663806639

-- 
David Marchand