Subject: [dpdk-dev] Mellanox ConnectX-5 crashes and mbuf leak
From: Martin Weiser
To: Adrien Mazarguil, Nelio Laranjeiro
Cc: dev@dpdk.org
Date: Tue, 26 Sep 2017 11:23:45 +0200

Hi,

we are currently testing the Mellanox ConnectX-5 100G NIC with DPDK 17.08 as well as dpdk-next-net and are experiencing mbuf leaks and crashes (and in some instances even kernel panics in an mlx5 module) under certain load conditions.

We initially saw these issues only in our own DPDK-based application, and it took some effort to reproduce them in one of the DPDK example applications. With the attached patch to the load-balancer example, however, we can reproduce the issues reliably. The patch may look odd at first, so let me explain why I made these changes:

* The sleep introduced in the worker threads simulates heavy processing, which causes the software rx rings to fill up under load. If the rings are large enough (I increased the ring size with the load-balancer command line option, as you can see in the example call further down), the mbuf pool may run empty, and I believe this leads to a malfunction in the mlx5 driver. As soon as this happens the NIC stops forwarding traffic, probably because the driver cannot allocate mbufs for the packets received by the NIC. Unfortunately, when this happens most of the mbufs never return to the mbuf pool, so even when the traffic stops the pool remains almost empty and the application will not forward traffic even at a very low rate.

* The use of the mbuf reference count, in addition to the situation described above, is what makes the mlx5 DPDK driver crash almost immediately under load. In our application we rely on this feature to forward the packet quickly while still sending it to a worker thread for analysis, and to free the packet only when the analysis is done. Here I simulated this by incrementing the mbuf reference count immediately after receiving the mbuf from the driver and then calling rte_pktmbuf_free in the worker thread, which should only decrement the reference count again and not actually free the mbuf (see the sketch right after this list).
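To make the intended rx/worker interaction explicit, here is a minimal sketch of the pattern the patch emulates. It is not the attached patch itself: the function names, the ring-based hand-off and the burst size are illustrative only, and all of the load-balancer plumbing is omitted.

    #include <unistd.h>

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_ring.h>

    /* I/O lcore: take one extra reference per received packet before handing
     * it to a worker, so the packet can be forwarded immediately while the
     * worker still holds a reference. */
    static void
    io_lcore_rx(uint8_t port_id, struct rte_ring *to_worker)
    {
            struct rte_mbuf *pkts[32];
            uint16_t i, n;

            n = rte_eth_rx_burst(port_id, 0, pkts, 32);
            for (i = 0; i < n; i++) {
                    rte_pktmbuf_refcnt_update(pkts[i], 1); /* refcnt 1 -> 2 */
                    rte_ring_enqueue(to_worker, pkts[i]);  /* hand off for analysis */
            }
            /* ... the same mbufs are also queued for tx forwarding here ... */
    }

    /* Worker lcore: the "analysis" is just a sleep; rte_pktmbuf_free() here
     * only drops the extra reference, and whichever free happens last
     * (worker or tx completion) actually returns the mbuf to the pool. */
    static void
    worker_lcore(struct rte_ring *from_io)
    {
            struct rte_mbuf *m;

            while (rte_ring_dequeue(from_io, (void **)&m) == 0) {
                    usleep(20);          /* simulate heavy processing */
                    rte_pktmbuf_free(m); /* refcnt 2 -> 1, no real free yet */
            }
    }

In the attached patch the same effect is achieved directly in app_lcore_io_rx_buffer_to_send() and app_lcore_worker() of the load-balancer example.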
We executed the patched load-balancer application with the following command line:

    ./build/load_balancer -l 3-7 -n 4 -- --rx "(0,0,3),(1,0,3)" --tx "(0,3),(1,3)" --w "4" --lpm "16.0.0.0/8=>0; 48.0.0.0/8=>1;" --pos-lb 29 --rsz "1024, 32768, 1024, 1024"

Then we generated traffic using the t-rex traffic generator and the sfr test case. On our machine the issues start to happen when the traffic exceeds ~6 Gbps, but this may vary depending on how powerful the test machine is (by the way, we were able to reproduce this on different types of hardware).

A typical stacktrace looks like this:

    Thread 1 "load_balancer" received signal SIGSEGV, Segmentation fault.
    0x0000000000614475 in _mm_storeu_si128 (__B=..., __P=) at /usr/lib/gcc/x86_64-linux-gnu/5/include/emmintrin.h:716
    716        __builtin_ia32_storedqu ((char *)__P, (__v16qi)__B);
    (gdb) bt
    #0  0x0000000000614475 in _mm_storeu_si128 (__B=..., __P=) at /usr/lib/gcc/x86_64-linux-gnu/5/include/emmintrin.h:716
    #1  rxq_cq_decompress_v (elts=0x7fff3732bef0, cq=0x7ffff7f99380, rxq=0x7fff3732a980) at /root/dpdk-next-net/drivers/net/mlx5/mlx5_rxtx_vec_sse.c:679
    #2  rxq_burst_v (pkts_n=, pkts=0xa7c7b0, rxq=0x7fff3732a980) at /root/dpdk-next-net/drivers/net/mlx5/mlx5_rxtx_vec_sse.c:1242
    #3  mlx5_rx_burst_vec (dpdk_rxq=0x7fff3732a980, pkts=, pkts_n=) at /root/dpdk-next-net/drivers/net/mlx5/mlx5_rxtx_vec_sse.c:1277
    #4  0x000000000043c11d in rte_eth_rx_burst (nb_pkts=3599, rx_pkts=0xa7c7b0, queue_id=0, port_id=0 '\000') at /root/dpdk-next-net//x86_64-native-linuxapp-gcc/include/rte_ethdev.h:2781
    #5  app_lcore_io_rx (lp=lp@entry=0xa7c700, n_workers=n_workers@entry=1, bsz_rd=bsz_rd@entry=144, bsz_wr=bsz_wr@entry=144, pos_lb=pos_lb@entry=29 '\035') at /root/dpdk-next-net/examples/load_balancer/runtime.c:198
    #6  0x0000000000447dc0 in app_lcore_main_loop_io () at /root/dpdk-next-net/examples/load_balancer/runtime.c:485
    #7  app_lcore_main_loop (arg=) at /root/dpdk-next-net/examples/load_balancer/runtime.c:669
    #8  0x0000000000495e8b in rte_eal_mp_remote_launch ()
    #9  0x0000000000441e0d in main (argc=, argv=) at /root/dpdk-next-net/examples/load_balancer/main.c:99

The crash does not always happen at the exact same spot, but in our tests it always occurred in the same function. In a few instances, instead of an application crash, the system froze completely with what appeared to be a kernel panic. The last output looked like a crash in the interrupt handler of an mlx5 module, but unfortunately I cannot provide the exact output right now.

All tests were performed under Ubuntu 16.04 server running a 4.4.0-96-generic kernel, and the latest Mellanox OFED (MLNX_OFED_LINUX-4.1-1.0.2.0-ubuntu16.04-x86_64) was used.

Any help with this issue is greatly appreciated.
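In case it helps with diagnosing the leak: the effect is easy to see by periodically printing the mempool counters while the test runs. Something along the lines of the sketch below is sufficient (illustrative only, not part of the attached patch; the pool pointer is whatever mempool the application created for its rx mbufs). After the traffic is stopped, the available count stays near zero instead of climbing back towards the pool size.

    #include <stdio.h>

    #include <rte_mempool.h>

    /* Print how many mbufs are currently free vs. in flight for the given
     * pool; in the failure case described above the available count stays
     * close to zero even after the traffic has stopped. */
    static void
    dump_pool_usage(const struct rte_mempool *mp)
    {
            printf("%s: %u mbufs available, %u in use\n", mp->name,
                   rte_mempool_avail_count(mp),
                   rte_mempool_in_use_count(mp));
    }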
Best regards,
Martin

Attachment test.patch:

diff --git a/config/common_base b/config/common_base
index 439f3cc..12b71e9 100644
--- a/config/common_base
+++ b/config/common_base
@@ -220,7 +220,7 @@ CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
 #
 # Compile burst-oriented Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD
 #
-CONFIG_RTE_LIBRTE_MLX5_PMD=n
+CONFIG_RTE_LIBRTE_MLX5_PMD=y
 CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8

diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index e54b785..d448100 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -41,6 +41,7 @@
 #include <stdarg.h>
 #include <errno.h>
 #include <getopt.h>
+#include <unistd.h>

 #include <rte_common.h>
 #include <rte_byteorder.h>
@@ -133,6 +134,8 @@ app_lcore_io_rx_buffer_to_send (
 	uint32_t pos;
 	int ret;

+	rte_pktmbuf_refcnt_update(mbuf, 1);
+
 	pos = lp->rx.mbuf_out[worker].n_mbufs;
 	lp->rx.mbuf_out[worker].array[pos ++] = mbuf;
 	if (likely(pos < bsz)) {
@@ -521,6 +524,8 @@ app_lcore_worker(
 		continue;
 #endif

+		usleep(20);
+
 		APP_WORKER_PREFETCH1(rte_pktmbuf_mtod(lp->mbuf_in.array[0], unsigned char *));
 		APP_WORKER_PREFETCH0(lp->mbuf_in.array[1]);

@@ -530,6 +535,8 @@ app_lcore_worker(
 			uint32_t ipv4_dst, pos;
 			uint32_t port;

+			rte_pktmbuf_free(lp->mbuf_in.array[j]);
+
 			if (likely(j < bsz_rd - 1)) {
 				APP_WORKER_PREFETCH1(rte_pktmbuf_mtod(lp->mbuf_in.array[j+1], unsigned char *));
 			}