From: "Burakov, Anatoly"
To: "Tan, Jianfeng", Olivier Matz, dev@dpdk.org
Date: Fri, 27 Apr 2018 17:46:47 +0100
Subject: Re: [dpdk-dev] [PATCH v3 3/5] eal: set name when creating a control thread
In-Reply-To: <0e10a781-00d7-7003-c103-496a85e72584@intel.com>
References: <20180403130439.11151-1-olivier.matz@6wind.com>
 <20180424144651.13145-1-olivier.matz@6wind.com>
 <20180424144651.13145-4-olivier.matz@6wind.com>
 <6de0fd38-b674-3f90-7cd0-098e4ae0ee21@intel.com>
 <0e10a781-00d7-7003-c103-496a85e72584@intel.com>
List-Id: DPDK patches and discussions

On 27-Apr-18 5:17 PM, Tan, Jianfeng wrote:
>
>
> On 4/27/2018 11:46 PM, Tan, Jianfeng wrote:
>> Hi Olivier,
>>
>> After this patch, I find the two IPC threads block at
>> pthread_barrier_wait(), and never wake up. Please refer below for more
>> information. The system is Ubuntu 16.04.
>>
>> On 4/24/2018 10:46 PM, Olivier Matz wrote:
>>> To avoid code duplication, add a parameter to rte_ctrl_thread_create()
>>> to specify the name of the thread.
>>>
>>> This requires to add a wrapper for the thread start routine in
>>> rte_thread_init(), which will first wait that the thread is configured.
>>>
>>> Signed-off-by: Olivier Matz
>>> ---
>>>   drivers/net/kni/rte_eth_kni.c                |  3 +-
>>>   lib/librte_eal/common/eal_common_proc.c      | 15 +++-----
>>>   lib/librte_eal/common/eal_common_thread.c    | 52
>>> +++++++++++++++++++++++++---
>>>   lib/librte_eal/common/include/rte_lcore.h    |  7 ++--
>>>   lib/librte_eal/linuxapp/eal/eal_interrupts.c | 13 ++-----
>>>   lib/librte_eal/linuxapp/eal/eal_timer.c      | 12 +------
>>>   lib/librte_vhost/socket.c                    | 25 +++----------
>>>   7 files changed, 66 insertions(+), 61 deletions(-)
>> [...]
>>> diff --git a/lib/librte_eal/common/eal_common_thread.c
>>> b/lib/librte_eal/common/eal_common_thread.c
>>> index efbccddbc..94d2a6e42 100644
>>> --- a/lib/librte_eal/common/eal_common_thread.c
>>> +++ b/lib/librte_eal/common/eal_common_thread.c
>>> @@ -7,6 +7,7 @@
>>>   #include
>>>   #include
>>>   #include
>>> +#include
>>>   #include
>>>   #include
>>>   #include
>>> @@ -141,10 +142,53 @@ eal_thread_dump_affinity(char *str, unsigned size)
>>>       return ret;
>>>   }
>>>   +
>>> +struct rte_thread_ctrl_params {
>>> +    void *(*start_routine)(void *);
>>> +    void *arg;
>>> +    pthread_barrier_t configured;
>>> +};
>>> +
>>> +static void *rte_thread_init(void *arg)
>>> +{
>>> +    struct rte_thread_ctrl_params *params = arg;
>>> +    void *(*start_routine)(void *) = params->start_routine;
>>> +    void *routine_arg = params->arg;
>>> +
>>> +    pthread_barrier_wait(&params->configured);
>>
>> This thread never wakes up.
>> The call trace as below:
>>
>> #0  0x00007ffff72a8154 in futex_wait (private=0, expected=0,
>> futex_word=0x7fffffffcff4)
>>     at ../sysdeps/unix/sysv/linux/futex-internal.h:61
>> #1  futex_wait_simple (private=0, expected=0,
>> futex_word=0x7fffffffcff4) at ../sysdeps/nptl/futex-internal.h:135
>> #2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at
>> pthread_barrier_wait.c:184
>> #3  0x000000000055216a in rte_thread_init (arg=0x7fffffffcfe0) at
>> /home/tan/git/dpdk/lib/librte_eal/common/eal_common_thread.c:160
>> #4  0x00007ffff72a16ba in start_thread (arg=0x7ffff6ecf700) at
>> pthread_create.c:333
>> #5  0x00007ffff6fd741d in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>
>>> +
>>> +    return start_routine(routine_arg);
>>> +}
>>> +
>>>   __rte_experimental int
>>> -rte_ctrl_thread_create(pthread_t *thread,
>>> -            const pthread_attr_t *attr,
>>> -            void *(*start_routine)(void *), void *arg)
>>> +rte_ctrl_thread_create(pthread_t *thread, const char *name,
>>> +        const pthread_attr_t *attr,
>>> +        void *(*start_routine)(void *), void *arg)
>>>   {
>>> -    return pthread_create(thread, attr, start_routine, arg);
>>> +    struct rte_thread_ctrl_params params = {
>>> +        .start_routine = start_routine,
>>> +        .arg = arg,
>>> +    };
>
> Update:
>
> I doubt it's due to that we defined this variable, params, on the stack;
> and the value seems be overwritten by following code. Will send a patch
> to fix it.

I'm not sure I follow you, but looking forward to the fix :)

As far as I can tell, even if the variable is on the stack, we copy the
values out of it before it is destroyed, so even if params somehow got
destroyed before the thread had a chance to start, we've already got all
the data we needed from it. I can't see how that value being allocated on
the stack makes a difference.

Just about the only thing I can see that's slightly wrong here is the lack
of pthread_barrier_destroy().
Perhaps add that as well? :)

>
> Thanks,
> Jianfeng
>
>
>>> +    int ret;
>>> +
>>> +    pthread_barrier_init(&params.configured, NULL, 2);
>>> +
>>> +    ret = pthread_create(thread, attr, rte_thread_init, (void
>>> *)&params);
>>> +    if (ret != 0)
>>> +        return ret;
>>> +
>>> +    if (name != NULL) {
>>> +        ret = rte_thread_setname(*thread, name);
>>> +        if (ret < 0)
>>> +            goto fail;
>>> +    }
>>> +
>>> +    pthread_barrier_wait(&params.configured);
>>
>> Here, the thread wakes up normally, and continues.
>>
>> Any idea on what's going on?
>>
>> Thanks,
>> Jianfeng
>>
>>> +
>>> +    return 0;
>>> +
>>> +fail:
>>> +    pthread_cancel(*thread);
>>> +    pthread_join(*thread, NULL);
>>> +    return ret;
>>>   }
>>
>
>

--
Thanks,
Anatoly
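
For reference, a minimal sketch of one way to address both points raised in
the thread above (params living on the stack, and the missing
pthread_barrier_destroy()). In the quoted patch the barrier is a member of
params, so the new thread waits on memory that belongs to the creator's stack
frame. The sketch below instead allocates the block on the heap and lets
whichever thread leaves the barrier last destroy and free it. This is an
illustration under assumptions, not the fix that was sent: ctrl_params,
ctrl_params_put, ctrl_thread_init, ctrl_thread_create and demo_routine are
hypothetical names, the reference count is an added assumption, and thread
naming and attributes are left out.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the params struct in the quoted patch. */
struct ctrl_params {
	void *(*start_routine)(void *);
	void *arg;
	pthread_barrier_t configured;
	atomic_int refs; /* threads that may still touch this block */
};

/* Drop one reference; the last holder destroys the barrier and frees. */
static void ctrl_params_put(struct ctrl_params *p)
{
	int prev = atomic_fetch_sub(&p->refs, 1); /* returns the old value */

	assert(prev >= 1);
	if (prev == 1) {
		/* Both threads have returned from pthread_barrier_wait(). */
		pthread_barrier_destroy(&p->configured);
		free(p);
	}
}

/* Wrapper run by the new thread: copy what it needs, sync, then release. */
static void *ctrl_thread_init(void *arg)
{
	struct ctrl_params *p = arg;
	void *(*start_routine)(void *) = p->start_routine;
	void *routine_arg = p->arg;

	pthread_barrier_wait(&p->configured);
	ctrl_params_put(p); /* p must not be used after this point */

	return start_routine(routine_arg);
}

/* Hypothetical creator; mirrors the shape of the quoted function. */
static int ctrl_thread_create(pthread_t *thread,
		void *(*start_routine)(void *), void *arg)
{
	struct ctrl_params *p = malloc(sizeof(*p));
	int ret;

	if (p == NULL)
		return ENOMEM;

	p->start_routine = start_routine;
	p->arg = arg;
	atomic_init(&p->refs, 2); /* creator + new thread */

	ret = pthread_barrier_init(&p->configured, NULL, 2);
	if (ret != 0) {
		free(p);
		return ret;
	}

	ret = pthread_create(thread, NULL, ctrl_thread_init, p);
	if (ret != 0) {
		/* No second thread exists, so cleanup here is safe. */
		pthread_barrier_destroy(&p->configured);
		free(p);
		return ret;
	}

	/* Creator-side configuration of the new thread would go here. */

	pthread_barrier_wait(&p->configured);
	ctrl_params_put(p);
	return 0;
}

/* Hypothetical start routine used only to exercise the sketch. */
static void *demo_routine(void *arg)
{
	*(int *)arg = 42;
	return arg;
}
```

Calling ctrl_thread_create(&tid, demo_routine, &value) and then joining tid
exercises the handshake. The reference count is there so that the barrier is
only destroyed once neither thread can still be inside
pthread_barrier_wait(), whichever of the two is scheduled last.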