From: Maxime Coquelin <maxime.coquelin@redhat.com>
Date: Wed, 2 May 2018 10:34:37 +0200
To: "Tan, Jianfeng", Olivier Matz, dev@dpdk.org
Cc: Anatoly Burakov, Thomas Monjalon
Subject: Re: [dpdk-dev] pthread_barrier_deadlock in -rc1

Hi Jianfeng,

On 05/02/2018 10:19 AM, Tan, Jianfeng wrote:
>
>
> On 5/1/2018 2:46 AM, Olivier Matz wrote:
>> Hi Maxime,
>>
>> On 30 April 2018 17:45:52 GMT+02:00, Maxime Coquelin wrote:
>>> Hi Olivier,
>>>
>>> On 04/24/2018 04:46 PM, Olivier Matz wrote:
>>>> Some parts of dpdk use their own management threads. Most of the time,
>>>> the affinity of the thread is not properly set: it should not be
>>>> scheduled on the dataplane cores, because interrupting them can cause
>>>> packet losses. This patchset introduces a new wrapper for thread
>>>> creation that does the job automatically, avoiding code duplication.
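As a quick illustration of what the wrapper automates -- creating a control
thread, naming it, and pinning it off the dataplane cores -- here is a
minimal sketch using plain pthread calls. It is not the actual
rte_ctrl_thread_create() implementation; the CPU number and the thread name
are only illustrative assumptions.

    /* Minimal sketch (not the real rte_ctrl_thread_create() code):
     * create a control thread, give it a name, and pin it to a
     * housekeeping CPU so it never runs on the dataplane cores. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *ctrl_loop(void *arg)
    {
        (void)arg;
        for (;;)            /* placeholder for real control-plane work */
            sleep(1);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        cpu_set_t cpus;

        if (pthread_create(&tid, NULL, ctrl_loop, NULL) != 0)
            return 1;

        /* Thread names are limited to 15 characters plus the NUL byte. */
        pthread_setname_np(tid, "ctrl-example");

        /* Keep the control thread off the dataplane cores; CPU 0 is
         * assumed here to be the master/housekeeping core. */
        CPU_ZERO(&cpus);
        CPU_SET(0, &cpus);
        if (pthread_setaffinity_np(tid, sizeof(cpus), &cpus) != 0)
            fprintf(stderr, "failed to set control thread affinity\n");

        pthread_join(tid, NULL);
        return 0;
    }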
>>>>
>>>> v3:
>>>> * new patch: use this API in examples when relevant.
>>>> * replace pthread_kill by pthread_cancel. Note that pthread_join()
>>>>   is still needed.
>>>> * rebase: vfio and pdump do not have control pthreads anymore, and eal
>>>>   has 2 new pthreads
>>>> * remove all calls to snprintf/strlcpy that truncate the thread name:
>>>>   all string lengths are already < 16.
>>>>
>>>> v2:
>>>> * set affinity to master core if no core is off, as suggested by
>>>>   Anatoly
>>>>
>>>> Olivier Matz (5):
>>>>   eal: use sizeof to avoid a double use of a define
>>>>   eal: new function to create control threads
>>>>   eal: set name when creating a control thread
>>>>   eal: set affinity for control threads
>>>>   examples: use new API to create control threads
>>>>
>>>>  drivers/net/kni/Makefile                     |  1 +
>>>>  drivers/net/kni/rte_eth_kni.c                |  3 +-
>>>>  examples/tep_termination/main.c              | 16 +++----
>>>>  examples/vhost/main.c                        | 19 +++-----
>>>>  lib/librte_eal/bsdapp/eal/eal.c              |  4 +-
>>>>  lib/librte_eal/bsdapp/eal/eal_thread.c       |  2 +-
>>>>  lib/librte_eal/common/eal_common_proc.c      | 15 ++----
>>>>  lib/librte_eal/common/eal_common_thread.c    | 72 ++++++++++++++++++++++++++++
>>>>  lib/librte_eal/common/include/rte_lcore.h    | 26 ++++++++++
>>>>  lib/librte_eal/linuxapp/eal/eal.c            |  4 +-
>>>>  lib/librte_eal/linuxapp/eal/eal_interrupts.c | 17 ++-----
>>>>  lib/librte_eal/linuxapp/eal/eal_thread.c     |  2 +-
>>>>  lib/librte_eal/linuxapp/eal/eal_timer.c      | 12 +----
>>>>  lib/librte_eal/rte_eal_version.map           |  1 +
>>>>  lib/librte_vhost/socket.c                    | 25 ++--------
>>>>  15 files changed, 135 insertions(+), 84 deletions(-)
>>>>
>>> I face a deadlock issue with your series that Jianfeng's patch
>>> ("eal: fix threads block on barrier") does not resolve. Reverting the
>>> series and Jianfeng's patch makes the issue disappear.
>>>
>>> I face the problem in a VM (not seen on the host):
>>> # ./install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 --proc-type auto
>>> --file-prefix pg -- --portmask=3 --forward-mode=macswap
>>> --port-topology=chained --disable-rss -i --rxq=1 --txq=1 --rxd=256
>>> --txd=256 --nb-cores=2 --auto-start
>>> EAL: Detected 3 lcore(s)
>>> EAL: Detected 1 NUMA nodes
>>> EAL: Auto-detected process type: PRIMARY
>>> EAL: Multi-process socket /var/run/.pg_unix
>>>
>>> Then it is stuck. Attaching with GDB, I get the backtrace information
>>> below:
>>>
>>> (gdb) info threads
>>>   Id   Target Id         Frame
>>>   3    Thread 0x7f63e1f9f700 (LWP 8808) "rte_mp_handle" 0x00007f63e2591bfd in recvmsg ()
>>>        at ../sysdeps/unix/syscall-template.S:81
>>>   2    Thread 0x7f63e179e700 (LWP 8809) "rte_mp_async" pthread_barrier_wait ()
>>>        at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
>>> * 1    Thread 0x7f63e32cec00 (LWP 8807) "testpmd" pthread_barrier_wait ()
>>>        at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
>>> (gdb) bt full
>>> #0  pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
>>> No locals.
>>> #1  0x0000000000520c54 in rte_ctrl_thread_create (thread=thread@entry=0x7ffe5c895020,
>>>     name=name@entry=0x869d86 "rte_mp_async", attr=attr@entry=0x0,
>>>     start_routine=start_routine@entry=0x521030 , arg=arg@entry=0x0)
>>>     at /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:207
>>>         params = 0x17b1e40
>>>         lcore_id =
>>>         cpuset = {__bits = {1, 0 }}
>>>         cpu_found =
>>>         ret = 0
>>> #2  0x00000000005220b6 in rte_mp_channel_init ()
>>>     at /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:674
>>>         path = "/var/run\000.pg_unix_*", '\000' ...
>>>         dir_fd = 4
>>>         mp_handle_tid = 140066969745152
>>>         async_reply_handle_tid = 140066961352448
>>> #3  0x000000000050c227 in rte_eal_init (argc=argc@entry=23, argv=argv@entry=0x7ffe5c896378)
>>>     at /root/src/dpdk/lib/librte_eal/linuxapp/eal/eal.c:775
>>>         i =
>>>         fctret = 11
>>>         ret =
>>>         thread_id = 140066989861888
>>>         run_once = {cnt = 1}
>>>         logid = 0x17b1e00 "testpmd"
>>>         cpuset = "T}\211\\\376\177", '\000' , "\020", '\000' ...
>>>         thread_name = "X}\211\\\376\177\000\000\226\301\036\342c\177\000"
>>>         __func__ = "rte_eal_init"
>>> #4  0x0000000000473214 in main (argc=23, argv=0x7ffe5c896378)
>>>     at /root/src/dpdk/app/test-pmd/testpmd.c:2597
>>>         diag =
>>>         port_id =
>>>         ret =
>>>         __func__ = "main"
>>> (gdb) thread 2
>>> [Switching to thread 2 (Thread 0x7f63e179e700 (LWP 8809))]
>>> #0  pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
>>> 71        cmpl    %edx, (%rdi)
>>> (gdb) bt full
>>> #0  pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71
>>> No locals.
>>> #1  0x0000000000520777 in rte_thread_init (arg=)
>>>     at /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:156
>>>         params =
>>>         start_routine = 0x521030
>>>         routine_arg = 0x0
>>> #2  0x00007f63e258add5 in start_thread (arg=0x7f63e179e700) at pthread_create.c:308
>>>         __res =
>>>         pd = 0x7f63e179e700
>>>         now =
>>>         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140066961352448, 1212869169857371576, 0,
>>>           8392704, 0, 140066961352448, -1291626103561052744, -1291619793368703560},
>>>           mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {
>>>           prev = 0x0, cleanup = 0x0, canceltype = 0}}}
>>>         not_first_call =
>>>         pagesize_m1 =
>>>         sp =
>>>         freesize =
>>> #3  0x00007f63e22b4b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>> No locals.
>>> (gdb) thread 3
>>> [Switching to thread 3 (Thread 0x7f63e1f9f700 (LWP 8808))]
>>> #0  0x00007f63e2591bfd in recvmsg () at ../sysdeps/unix/syscall-template.S:81
>>> 81    T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
>>> (gdb) bt full
>>> #0  0x00007f63e2591bfd in recvmsg () at ../sysdeps/unix/syscall-template.S:81
>>> No locals.
>>> #1  0x000000000052194e in read_msg (s=0x7f63e1f9d3b0, m=0x7f63e1f9d5a0)
>>>     at /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:258
>>>         msglen =
>>>         control = "\000\000\000\000\000\000\000\000\336~\f\343c\177\000\000\005", '\000' , "\360\371\033\342c\177\000"
>>>         cmsg =
>>>         iov = {iov_base = 0x7f63e1f9d5a0, iov_len = 332}
>>>         msgh = {msg_name = 0x7f63e1f9d3b0, msg_namelen = 110, msg_iov = 0x7f63e1f9d370,
>>>           msg_iovlen = 1, msg_control = 0x7f63e1f9d380, msg_controllen = 48, msg_flags = 0}
>>> #2  mp_handle (arg=) at /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:346
>>>         msg = {type = 0, msg = {name = '\000' , len_param = 0, num_fds = 0,
>>>           param = '\000' , "\002", '\000' , fds = {0, 0, 0, 0, 0, 0, 0, 0}}}
>>>         sa = {sun_family = 55104,
>>>           sun_path = "\371\341c\177\000\000\352\372\f\343c\177\000\000\000\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377\000\367\371\341c\177\000\000\030\000\000\000\000\000\000\000p\327\371\341c\177\000\000\000\367\371\341c\177\000\000\000\367\371\341c\177",
>>>           '\000' , "\200\037\000\000\377\377"}
>>> #3  0x00007f63e258add5 in start_thread (arg=0x7f63e1f9f700) at pthread_create.c:308
>>>         __res =
>>>         pd = 0x7f63e1f9f700
>>>         now =
>>>         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140066969745152, 1212869169857371576, 0,
>>>           8392704, 0, 140066969745152, -1291625004586295880, -1291619793368703560},
>>>           mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {
>>>           prev = 0x0, cleanup = 0x0, canceltype = 0}}}
>>>         not_first_call =
>>>         pagesize_m1 =
>>>         sp =
>>>         freesize =
>>> #4  0x00007f63e22b4b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>> No locals.
>>>
>>> I don't have more info for now.
>>>
>>
>> Thanks for the feedback on this issue. I don't see an obvious reason for
>> this deadlock yet.
>>
>> I'll investigate it asap (not tomorrow, but Wednesday). In the worst
>> case, we can revert the series if I cannot find the root cause rapidly.
>
> I think Stephen's suggestion of destroying the barrier may help with this
> issue. I'll try to reproduce it and test the fix before sending a patch.

In case you cannot reproduce it, feel free to send me the patch and I will
test it.

Thanks,
Maxime

> Thanks,
> Jianfeng
>
>>
>> Olivier
>>
>
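For reference, the rendezvous pattern being debugged above looks roughly like
the sketch below. It is not the actual EAL code, only a self-contained
illustration of where the creating thread can block on the barrier and of a
safe place to destroy the barrier in this simplified setting.

    /* Illustration of the creator/child rendezvous (not the DPDK code):
     * both sides call pthread_barrier_wait(), and the barrier is torn
     * down once neither side needs it any more. */
    #include <pthread.h>

    static pthread_barrier_t started;

    static void *ctrl_thread(void *arg)
    {
        (void)arg;
        /* Tell the creator we are up, then do the real work (omitted). */
        pthread_barrier_wait(&started);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;

        /* Two parties: the creating thread and the new control thread. */
        pthread_barrier_init(&started, NULL, 2);

        if (pthread_create(&tid, NULL, ctrl_thread, NULL) != 0)
            return 1;

        /* The creator blocks here until the control thread reaches its
         * own pthread_barrier_wait(); if the child never gets there,
         * this is where the backtraces above show testpmd hanging. */
        pthread_barrier_wait(&started);

        /* In this toy example the child exits right after the barrier,
         * so joining first guarantees nobody is still inside it before
         * it is destroyed.  The real wrapper returns to its caller after
         * the barrier instead of joining, which is why it needs the
         * barrier in the first place. */
        pthread_join(tid, NULL);
        pthread_barrier_destroy(&started);
        return 0;
    }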