From: Stephen Hemminger
To: Thomas Monjalon
Cc: Shreyansh Jain, Jianfeng Tan, dev@dpdk.org, Olivier Matz, Anatoly Burakov
Date: Fri, 27 Apr 2018 18:21:41 -0700
Subject: Re: [dpdk-dev] [PATCH] eal: fix threads block on barrier
Message-ID: <20180427182141.227af689@xeon-e3>
In-Reply-To: <13763738.ezdo4hZiut@xps>
References: <1524847302-88110-1-git-send-email-jianfeng.tan@intel.com>
 <20180427103945.511a118e@xeon-e3>
 <13763738.ezdo4hZiut@xps>

On Fri, 27 Apr 2018 21:52:26 +0200
Thomas Monjalon wrote:

> 27/04/2018 19:45, Shreyansh Jain:
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > Shreyansh Jain wrote:
> > > > From: Jianfeng Tan
> > > > > Below commit introduced pthread barrier for synchronization.
> > > > > But two IPC threads block on the barrier, and never wake up.
> > > > >
> > > > > (gdb) bt
> > > > > #0  futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
> > > > >     at ../sysdeps/unix/sysv/linux/futex-internal.h:61
> > > > > #1  futex_wait_simple (private=0, expected=0,
> > > > >     futex_word=0x7fffffffcff4)
> > > > >     at ../sysdeps/nptl/futex-internal.h:135
> > > > > #2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at
> > > > >     pthread_barrier_wait.c:184
> > > > > #3  rte_thread_init (arg=0x7fffffffcfe0)
> > > > >     at ../dpdk/lib/librte_eal/common/eal_common_thread.c:160
> > > > > #4  start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
> > > > > #5  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> > > > >
> > > > > Through analysis, we find the barrier defined on the stack
> > > > > could be the root cause. This patch will change to use heap
> > > > > memory as the barrier.
> > > > >
> > > > > Fixes: d651ee4919cd ("eal: set affinity for control threads")
> > > > >
> > > > > Cc: Olivier Matz
> > > > > Cc: Anatoly Burakov
> > > > >
> > > > > Signed-off-by: Jianfeng Tan
> > > >
> > > > Though I have seen Stephen's comment on this (possibly a library
> > > > bug), this at least fixes an issue which was dogging dpaa and dpaa2 -
> > > > generating bus errors and futex errors with variation in core masks
> > > > provided to applications.
> > > >
> > > > Thanks a lot for this.
> > > >
> > > > Acked-by: Shreyansh Jain
>
> Applied, thanks Jianfeng.
>
> > > Could you verify there is not a use after free by using valgrind or
> > > some library that poisons memory on free.
> >
> > I will probably do that soon - but for the time being I don't want
> > this issue to block the dpaa/dpaa2 for RC1 - these drivers were
> > completely unusable without this patch.
>
> Please Shreyansh, continue the analysis of this bug.
> Thanks
>

I think the patch needs to change.
The attributes need to be either global, or leaked and never freed.

The glibc source for init keeps the pointer to the attributes.
static const struct pthread_barrierattr default_barrierattr =
  {
    .pshared = PTHREAD_PROCESS_PRIVATE
  };


int
__pthread_barrier_init (pthread_barrier_t *barrier,
                        const pthread_barrierattr_t *attr, unsigned int count)
{
  struct pthread_barrier *ibarrier;

  /* XXX EINVAL is not specified by POSIX as a possible error code for COUNT
     being too large.  See pthread_barrier_wait for the reason for the
     comparison with BARRIER_IN_THRESHOLD.  */
  if (__glibc_unlikely (count == 0 || count >= BARRIER_IN_THRESHOLD))
    return EINVAL;

  const struct pthread_barrierattr *iattr
    = (attr != NULL
       ? (struct pthread_barrierattr *) attr
       : &default_barrierattr);

  ibarrier = (struct pthread_barrier *) barrier;

  /* Initialize the individual fields.  */
  ibarrier->in = 0;
  ibarrier->out = 0;
  ibarrier->count = count;
  ibarrier->current_round = 0;
  ibarrier->shared = (iattr->pshared == PTHREAD_PROCESS_PRIVATE
                      ? FUTEX_PRIVATE : FUTEX_SHARED);

  return 0;
}
weak_alias (__pthread_barrier_init, pthread_barrier_init)
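
To make the lifetime concern concrete, here is a rough sketch of one way to
keep the barrier and its attributes alive until every thread is done with
them. This is not the DPDK patch and not code from eal_common_thread.c: the
names ctrl_params, params_put, worker and run_demo are hypothetical, and the
reference count is my assumption about how the "who frees it" question could
be settled. The barrier and the attributes share one heap block, and the last
thread to drop its reference destroys and frees it, so nothing lives on the
creator's stack.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical parameter block; NOT the struct DPDK actually uses. */
struct ctrl_params {
    pthread_barrier_t barrier;
    pthread_barrierattr_t attr; /* kept beside the barrier: same lifetime */
    atomic_int refs;            /* threads that still hold this block */
    int value;                  /* data handed across the barrier */
};

/* Drop one reference; the last holder destroys and frees everything. */
static void params_put(struct ctrl_params *p)
{
    /* atomic_fetch_sub returns the previous count; 1 means we were last. */
    if (atomic_fetch_sub(&p->refs, 1) == 1) {
        pthread_barrier_destroy(&p->barrier);
        pthread_barrierattr_destroy(&p->attr);
        free(p);
    }
}

static void *worker(void *arg)
{
    struct ctrl_params *p = arg;

    p->value = 42;                     /* written before the barrier */
    pthread_barrier_wait(&p->barrier); /* rendezvous with the creator */
    params_put(p);                     /* p must not be touched after this */
    return NULL;
}

/* Returns the value seen after the barrier, or -1 on setup failure. */
int run_demo(void)
{
    struct ctrl_params *p = calloc(1, sizeof(*p));
    pthread_t tid;
    int seen;

    if (p == NULL)
        return -1;
    if (pthread_barrierattr_init(&p->attr) != 0) {
        free(p);
        return -1;
    }
    if (pthread_barrier_init(&p->barrier, &p->attr, 2) != 0) {
        pthread_barrierattr_destroy(&p->attr);
        free(p);
        return -1;
    }
    atomic_init(&p->refs, 2); /* one reference each: creator and worker */

    if (pthread_create(&tid, NULL, worker, p) != 0) {
        pthread_barrier_destroy(&p->barrier);
        pthread_barrierattr_destroy(&p->attr);
        free(p);
        return -1;
    }

    pthread_barrier_wait(&p->barrier); /* both threads have arrived */
    seen = p->value;                   /* read while we still hold a ref */
    params_put(p);
    pthread_join(tid, NULL);
    return seen;
}
```

Each side drops its reference only after it has returned from
pthread_barrier_wait, so the destroy cannot run while a thread is still
inside the wait. Running this under valgrind, as suggested earlier in the
thread, would be a reasonable way to check the sketch for a use after free.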