From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73])
	by dpdk.org (Postfix) with ESMTP id 30D0323C
	for ; Wed,  2 May 2018 11:41:50 +0200 (CEST)
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id AE0A684257;
	Wed,  2 May 2018 09:41:49 +0000 (UTC)
Received: from [10.36.112.54] (ovpn-112-54.ams2.redhat.com [10.36.112.54])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 8EC0363539;
	Wed,  2 May 2018 09:41:48 +0000 (UTC)
To: "Tan, Jianfeng" , Olivier Matz
Cc: dev@dpdk.org, Anatoly Burakov , Thomas Monjalon
References: <20180403130439.11151-1-olivier.matz@6wind.com>
 <20180424144651.13145-1-olivier.matz@6wind.com>
 <4256B2F0-EF9D-4B22-AC1A-D440C002360A@6wind.com>
 <39d5baf8-2bad-6df8-0419-a06c65d41475@redhat.com>
 <2d828aa1-482f-7f19-1909-c3ca4599c9b2@intel.com>
 <393a2f7e-ed20-fa28-0b07-aa3374593d5a@redhat.com>
 <20180502092011.5nxl5nbka6zfi4hb@neon>
 <7afa9235-cc14-a05f-7f85-87d8a40d447e@intel.com>
From: Maxime Coquelin
Message-ID:
Date: Wed, 2 May 2018 11:41:47 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.7.0
MIME-Version: 1.0
In-Reply-To: <7afa9235-cc14-a05f-7f85-87d8a40d447e@intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.11.55.2]); Wed, 02 May 2018 09:41:49 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com
 [10.11.55.2]); Wed, 02 May 2018 09:41:49 +0000 (UTC) for IP:'10.11.54.5'
 DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com'
 HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:''
Subject: Re: [dpdk-dev] pthread_barrier_deadlock in -rc1
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 02 May 2018 09:41:50 -0000

On 05/02/2018 11:32 AM, Tan, Jianfeng wrote:
> Hi Maxime and Olivier,
>
> [...]
>>>> Below patch can fix another strange sigsegv issue in my VM. Please
>>>> check if it works for you. I doubt it's use-after-free problem which
>>>> could lead to different issues in different env. Please have a try.
>>>>
>>>>
>>>> diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
>>>> index de69452..d91b67d 100644
>>>> --- a/lib/librte_eal/common/eal_common_thread.c
>>>> +++ b/lib/librte_eal/common/eal_common_thread.c
>>>> @@ -205,6 +205,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
>>>>                   goto fail;
>>>>
>>>>           pthread_barrier_wait(&params->configured);
>>>> +       pthread_barrier_destroy(&params->configured);
>>> Thanks Jianfeng, that fixes my issue.
>>> For correctness, I wonder whether we should check pthread_barrier_wait
>>> return, and only call destroy() if PTHREAD_BARRIER_SERIAL_THREAD?
>>> And so also do same the same thing in rte_thread_init().
>>>
>>> What do you think?
>>> Thanks,
>>> Maxime
>>
>> Thanks for the update. I also have a patch that replaces the barrier by
>> a lock which could also work, but if Jianfeng's one fixes the issue, I
>> think it is better.
>>
>> About the PTHREAD_BARRIER_SERIAL_THREAD, not sure it will change
>> something:
>>
>>         Upon successful completion, the pthread_barrier_wait() function
>>         shall return PTHREAD_BARRIER_SERIAL_THREAD for a single
>>         (arbitrary) thread synchronized at the barrier and zero for each
>>         of the other threads. Otherwise, an error number shall be
>>         returned to indicate the error.
>>
>> I understand that it will ensure that only one barrier will return
>> PTHREAD_BARRIER_SERIAL_THREAD, but not necessarily the last one. So
>> if destroy() is called in the parent thread, it should be the same, no?
>>
>> By the way, there is also a small memory leak that was introduced by
>> the previous patch, maybe you can add the fix too:
>>
>> -       if (ret != 0)
>> +       if (ret != 0) {
>> +               free(params);
>>                  return ret;
>> +       }
>
> How about: the thread who gets PTHREAD_BARRIER_SERIAL_THREAD returned,
> is responsible for the destroy and free(params)?

I agree with your suggestion.

Thanks,
Maxime

> Thanks,
> Jianfeng