From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Mon, 06 Apr 2020 04:10:11 +0000
Subject: [dpdk-dev] [Bug 440] net/mlx5: Read of "out_of_buffer" using fopen/fread/fclose causing TLB shootdowns due to mmap/munmap

https://bugs.dpdk.org/show_bug.cgi?id=440

            Bug ID: 440
           Summary: net/mlx5: Read of "out_of_buffer" using
                    fopen/fread/fclose causing TLB shootdowns due to
                    mmap/munmap
           Product: DPDK
           Version: 19.11
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: mohsinshaikh@niometrics.com
  Target Milestone: ---

Created attachment 86
  --> https://bugs.dpdk.org/attachment.cgi?id=86&action=edit
Potential fix patch

Setup:
HPE ProLiant DL560 Gen10 Server with 4 x "Intel(R) Xeon(R) Platinum 8180M
CPU @ 2.50GHz" and 6TB of memory (1 TB reserved huge pages) with 2 x
"Mellanox Technologies ConnectX®-5 EN network interface card, 100GbE
single-port QSFP28, PCIe3.0 x16, tall bracket; MCX515A-CCAT".
OS is CentOS 7 with kernel 3.10.0-957.27.2.el7.x86_64.

While testing I noticed that the rx cores of the DPDK app were getting TLB
shootdowns periodically. After debugging I found that the stats thread of the
DPDK app was responsible for those. Specifically, the read of the
"out_of_buffer" stat:

static inline void
mlx5_read_ib_stat(struct mlx5_priv *priv, const char *ctr_name, uint64_t *stat)
{
	FILE *file;

	if (priv->sh) {
		MKSTR(path, "%s/ports/%d/hw_counters/%s",
			  priv->sh->ibdev_path,
			  priv->ibv_port,
			  ctr_name);
		file = fopen(path, "rb");
		if (file) {
			/* Calls mmap(). */
			int n = fscanf(file, "%" SCNu64, stat);

			/* Calls munmap(), which leads to a TLB shootdown. */
			fclose(file);
			if (n == 1)
				return;
		}
	}
	*stat = 0;
}

NOTE: We almost exclusively use huge pages in the DPDK app for allocations.
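
The mmap()/munmap() pair comes from stdio's own stream buffer, which is allocated lazily on the first read and released by fclose(). One possible way to keep stdio and still avoid that allocation is to lend the stream a caller-owned buffer with setvbuf(). A minimal sketch, assuming the C library uses a caller-supplied buffer as-is (the C standard permits this but does not strictly require it, so it is worth confirming with strace on the target glibc); read_counter_stdio() is a hypothetical helper, not part of the mlx5 PMD:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the mlx5 PMD): read one unsigned decimal
 * counter from a text file through stdio, lending the stream a
 * caller-owned buffer so the C library does not need to allocate one on
 * the first read or release one in fclose().
 * Returns 0 on success, -1 on failure (*stat is then 0).
 */
static int
read_counter_stdio(const char *path, uint64_t *stat)
{
	char iobuf[256];	/* stack buffer lent to the stream */
	FILE *file;
	int n;

	*stat = 0;
	file = fopen(path, "rb");
	if (file == NULL)
		return -1;
	/* setvbuf() must come before any other operation on the stream. */
	if (setvbuf(file, iobuf, _IOFBF, sizeof(iobuf)) != 0) {
		fclose(file);
		return -1;
	}
	n = fscanf(file, "%" SCNu64, stat);
	fclose(file);	/* iobuf is still in scope here, as required */
	if (n != 1) {
		*stat = 0;
		return -1;
	}
	return 0;
}
```

The buffer has to outlive the stream, which is why fclose() is called before the function returns on every path.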

Stack traces:

(gdb) where
#0  __mmap (addr=addr@entry=0x0, len=4096, prot=prot@entry=3,
    flags=flags@entry=34, fd=fd@entry=-1, offset=offset@entry=0)
    at ../sysdeps/unix/sysv/linux/wordsize-64/mmap.c:32
#1  0x00007fffdeaf8ed1 in __GI__IO_file_doallocate (fp=0x2b52263ae000)
    at filedoalloc.c:127
#2  0x00007fffdeb07d07 in __GI__IO_doallocbuf (fp=fp@entry=0x2b52263ae000)
    at genops.c:399
#3  0x00007fffdeb06c8c in _IO_new_file_underflow (fp=0x2b52263ae000)
    at fileops.c:557
#4  0x00007fffdeb07dd2 in __GI__IO_default_uflow (fp=0x2b52263ae000)
    at genops.c:414
#5  0x00007fffdeae7efa in _IO_vfscanf_internal (s=, format=,
    argptr=argptr@entry=0x7fff51f07498, errp=errp@entry=0x0) at vfscanf.c:600
#6  0x00007fffdeaed617 in ___vfscanf (s=, format=,
    argptr=argptr@entry=0x7fff51f07498) at vfscanf.c:2942
#7  0x00007fffdeaf6247 in __fscanf (stream=, format=) at fscanf.c:31
#8  0x00007fffe632ee9f in mlx5_stats_get ()
    from /opt/nio/dpdk/lib/librte_pmd_mlx5.so.20.0
#9  0x00007fffeee6b468 in rte_eth_stats_get ()
    from /opt/nio/dpdk/lib/librte_ethdev.so.20.0

Breakpoint 2, munmap () at ../sysdeps/unix/syscall-template.S:81
81	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) where
#0  munmap () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fffdeb07cb2 in __GI__IO_setb (f=f@entry=0x2b52263ae000,
    b=b@entry=0x0, eb=eb@entry=0x0, a=a@entry=0) at genops.c:383
#2  0x00007fffdeb06090 in _IO_new_file_close_it (fp=fp@entry=0x2b52263ae000)
    at fileops.c:192
#3  0x00007fffdeaf9138 in _IO_new_fclose (fp=0x2b52263ae000) at iofclose.c:58
#4  0x00007fffe632eeaa in mlx5_stats_get ()
    from /opt/nio/dpdk/lib/librte_pmd_mlx5.so.20.0
#5  0x00007fffeee6b468 in rte_eth_stats_get ()
    from /opt/nio/dpdk/lib/librte_ethdev.so.20.0

Output from kernel tracing (using trace-cmd):

<...>-120694 [000] 502442.376421: function:             native_flush_tlb_others
<...>-120694 [000] 502442.376423: kernel_stack:
=> tlb_flush_mmu.part.76 (ffffffffa89e57d7)
=> tlb_finish_mmu (ffffffffa89e70d5)
=> unmap_region (ffffffffa89f0754)
=> do_munmap (ffffffffa89f2d45)
=> vm_munmap (ffffffffa89f2f85)
=> SyS_munmap (ffffffffa89f4212)
=> tracesys (ffffffffa8f7706b)
<...>-120324 [002] 502442.382250: function:             flush_tlb_func
<...>-120323 [001] 502442.382250: function:             flush_tlb_func
<...>-120325 [003] 502442.382251: function:             flush_tlb_func
<...>-120327 [005] 502442.382251: function:             flush_tlb_func
<...>-120326 [004] 502442.382251: function:             flush_tlb_func
<...>-120332 [010] 502442.382251: function:             flush_tlb_func
<...>-120324 [002] 502442.382252: kernel_stack:
=> generic_smp_call_function_single_interrupt (ffffffffa8912ea3)
=> smp_call_function_interrupt (ffffffffa885737d)
=> call_function_interrupt (ffffffffa8f79382)

The following patch seems to fix the issue:

=========================================================================
commit e4e93bd3ffb4e75b2178b0fdfe12341bfddf171d
Author: Mohsin Shaikh
Date:   Tue Mar 31 14:28:41 2020 +0800

    net/mlx5: Use open/read/close for reading ib stat

diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index 205e4fe..769f080 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -8,6 +8,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include
 #include
@@ -139,20 +141,22 @@
 static inline void
 mlx5_read_ib_stat(struct mlx5_priv *priv, const char *ctr_name, uint64_t *stat)
 {
-	FILE *file;
+	int fd;
 
 	if (priv->sh) {
 		MKSTR(path, "%s/ports/%d/hw_counters/%s",
 			  priv->sh->ibdev_path,
 			  priv->ibv_port,
 			  ctr_name);
-		file = fopen(path, "rb");
-		if (file) {
-			int n = fscanf(file, "%" SCNu64, stat);
-
-			fclose(file);
-			if (n == 1)
+		fd = open(path, O_RDONLY);
+		if (fd != -1) {
+			char buf[32];
+			ssize_t n = read(fd, buf, sizeof(buf));
+			close(fd);
+			if (n != -1) {
+				sscanf(buf, "%lu", stat);
 				return;
+			}
 		}
 	}
 	*stat = 0;
=========================================================================

Patch attached to bug.

These TLB shootdowns lead to packets being dropped sporadically by the mlx5
NICs with the "rx_discards_phy" counter being incremented. The above patch
significantly reduced the drops in our app.
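
The patch above hands buf to sscanf() without NUL-terminating it, and it parses a uint64_t with "%lu", which only matches on platforms where uint64_t is unsigned long. A more defensive variant of the same open/read/close idea could look like the sketch below. It assumes the counter file holds a single unsigned decimal number that fits in one short read; read_counter_fd() is a hypothetical name, and this is an illustration rather than the attached patch:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the mlx5 PMD): read one unsigned decimal
 * counter using only open/read/close, so no stdio stream buffer is ever
 * allocated or freed.
 * Returns 0 on success, -1 on failure (*stat is then 0).
 */
static int
read_counter_fd(const char *path, uint64_t *stat)
{
	char buf[32];
	char *end;
	unsigned long long val;
	ssize_t n;
	int fd;

	*stat = 0;
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	/* Leave one byte free for the terminator added below. */
	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	if (n <= 0)
		return -1;	/* read error or empty file */
	buf[n] = '\0';
	errno = 0;
	val = strtoull(buf, &end, 10);
	if (end == buf || errno == ERANGE)
		return -1;	/* no digits found, or value out of range */
	*stat = (uint64_t)val;
	return 0;
}
```

Returning a status instead of silently storing 0 lets the caller tell a failed read apart from a counter that really is zero.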