From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 0F653A0471
 for ; Fri, 19 Jul 2019 06:41:06 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id B0A462BA8;
 Fri, 19 Jul 2019 06:41:04 +0200 (CEST)
Received: from guri.nttv6.jp (guri.nttv6.jp [115.69.228.140])
 by dpdk.org (Postfix) with ESMTP id 8ED4C231E
 for ; Fri, 19 Jul 2019 06:41:02 +0200 (CEST)
Received: from z.nttv6.jp (z.nttv6.jp [IPv6:2402:c800:ff06:6::f])
 by guri.nttv6.jp (NTTv6MTA) with ESMTP id 366B725F6BD;
 Fri, 19 Jul 2019 13:41:00 +0900 (JST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nttv6.jp;
 s=20180820; t=1563511260;
 h=from:from:sender:reply-to:subject:subject:date:date:
 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
 content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=6R1gQb/by+X+QtHAPoRLlOoCzzSmDpT4OPklrbmxpmY=;
 b=eQecSFL+wJW3Dm/5l/v9gxEniEG4NC3okggnUSITUzR1203LqQcSBFyIdmgkv1Mr2a94pD
 nQfG1XoPY9G7VdIn+JexQr0UGObwBF1ncpeB/WN/WmOdeF4Qyr23cUAO4vdVSdtmIt3XY+
 ON0daEKgXz8i6Zl0Dt+HHRByyFLCpb8=
Received: from localhost (fujiko.nttv6.jp [IPv6:2402:c800:ff06:136::141])
 by z.nttv6.jp (NTTv6MTA) with ESMTPSA id 188CA75906F;
 Fri, 19 Jul 2019 13:41:00 +0900 (JST)
Date: Fri, 19 Jul 2019 13:40:34 +0900 (JST)
Message-Id: <20190719.134034.1626469017793816403.yasu@nttv6.jp>
To: viacheslavo@mellanox.com
Cc: dev@dpdk.org
From: Yasuhiro Ohara
In-Reply-To:
References: <20190713.013853.751044529514409504.yasu@nttv6.jp>
 <20190719.111945.2086809368117346464.yasu@nttv6.jp>
Organizaton: NTT Communications
X-Mailer: Mew version 6.8 on Emacs 26.1
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Authentication-Results: guri.nttv6.jp; spf=pass smtp.mailfrom=yasu@nttv6.jp
Subject: Re: [dpdk-dev] ConnectX-4/mlx5 crashes around rxq_cqe_comp_en?
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi Slava,

Thank you, I will file it.

regards,
Yasu

From: Slava Ovsiienko
Subject: Re: [dpdk-dev] ConnectX-4/mlx5 crashes around rxq_cqe_comp_en?
Date: Fri, 19 Jul 2019 03:53:54 +0000
Message-ID:

> Hi, Yasuhiro
>
> Could you, please, create the ticket in Bugzilla.dpdk.org to store the details?
> The Rx CQE compression can be disabled with specifying "rxq_cqe_comp_en=0".
>
> WBR, Slava
>
>> -----Original Message-----
>> From: dev On Behalf Of Yasuhiro Ohara
>> Sent: Friday, July 19, 2019 5:20
>> To: dev@dpdk.org
>> Subject: Re: [dpdk-dev] ConnectX-4/mlx5 crashes around rxq_cqe_comp_en?
>>
>>
>> The same goes with DPDK-19.05 too.
>>
>> When crash happens,
>> mcqe_n == t_pkt->data_len == 124.
>>
>> struct rte_mbuf **elts (which seems to be prepared somewhere) looks like
>> it's supposed to contain valid mbufs, but (when under a significant load?) it
>> doesn't.
>>
>> (gdb) p/x (void*[124])elts[0]
>> $31 = {0x1d0bd0d80, 0x1d1cfef80, 0x1d28f6a40, 0x1d22eb100, 0x1d195a8c0,
>>   0x1d2137200, 0x1d1eb5540, 0x1d1d0fec0, 0x1d28ecf40, 0x1d19b1bc0,
>>   0x1cec8a200, 0x1d02e2980, 0x1d085cdc0, 0x1d04e8e00, 0x1ccb4e140,
>>   0x1d1e17e80, 0x1d17a1c40, 0x1d14a6e00, 0x1d2871700, 0x1d20b6c40,
>>   0x1d29831c0, 0x1d04941c0, 0x1d0921080, 0x1d070ea40, 0x1d148ea80,
>>   0x1cee100c0, 0x1d1a47e40, 0x1d0ee6600, 0x1d02f1200, 0x1d24bc100,
>>   0x1d1e84e40, 0x1d1e1f2c0, 0x1d28b7ac0, 0x1d2195940, 0x1d21bc540,
>>   0x1d228f080, 0x1d1026100, 0x1d285e100, 0x1d211c7c0, 0x1d2128980,
>>   0x1d1787200, 0x1d170e080, 0x1d1e0e380, 0x1ce638500, 0x1d21a6880,
>>   0x1d20d8ac0, 0x1d25e8600, 0x1d2377880, 0x1d0e13ac0, 0x1c0c07100,
>>   0x1c0c07100, 0x1c0c07100, 0x1c0c07100, 0x0, 0x0, 0x0, 0x0, 0x7ffff7ff487c,
>>   0x1c0c06f00, 0x1c0c08b00, 0x0, 0x0, 0x7ffff7ff207c, 0x1, 0x1480,
>>   0x140000000, 0x100000000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
>>   0x40000000ffffffff, 0x4000000000000001, 0xcd64010000000002, 0x0 }
>>
>> (gdb) p elts[48]
>> $38 = (struct rte_mbuf *) 0x1d0e13ac0
>> (gdb) p elts[49]
>> $39 = (struct rte_mbuf *) 0x1c0c07100
>> (gdb) p elts[50]
>> $40 = (struct rte_mbuf *) 0x1c0c07100
>> (gdb) p elts[51]
>> $41 = (struct rte_mbuf *) 0x1c0c07100
>> (gdb) p elts[52]
>> $42 = (struct rte_mbuf *) 0x1c0c07100
>> (gdb) p elts[53]
>> $43 = (struct rte_mbuf *) 0x0
>>
>> Any thoughts?
>>
>> regards,
>> Yasu
>>
>> From: Yasuhiro Ohara
>> Subject: [dpdk-dev] ConnectX-4/mlx5 crashes around rxq_cqe_comp_en?
>> Date: Sat, 13 Jul 2019 01:38:53 +0900 (JST)
>> Message-ID: <20190713.013853.751044529514409504.yasu@nttv6.jp>
>>
>> >
>> > Hi,
>> >
>> > I get a crash when I put a significant amount of load on
>> > ConnectX-4/mlx5, i.e., 50Gbps for 100GbE port.
>> >
>> > Thread 22 "lcore-slave-19" received signal SIGSEGV, Segmentation fault.
>> > [Switching to Thread 0x7fffe77ee700 (LWP 33519)]
>> > 0x0000555555f010a3 in _mm_storeu_si128 (__B=..., __P=0x10)
>> >     at /usr/lib/gcc/x86_64-linux-gnu/7/include/emmintrin.h:721
>> > 721       *__P = __B;
>> > (gdb) bt
>> > #0  0x0000555555f010a3 in _mm_storeu_si128 (__B=..., __P=0x10)
>> >     at /usr/lib/gcc/x86_64-linux-gnu/7/include/emmintrin.h:721
>> > #1  rxq_cq_decompress_v (rxq=0x22c910ccc0, cq=0x22c8fd1800, elts=0x22c910d240)
>> >     at /usr/local/dpdk-stable-18.11.2/drivers/net/mlx5/mlx5_rxtx_vec_sse.h:421
>> > #2  0x0000555555f04b42 in rxq_burst_v (rxq=0x22c910ccc0, pkts=0x7fffe77eba40,
>> >     pkts_n=32, err=0x7fffe77dc978)
>> >     at /usr/local/dpdk-stable-18.11.2/drivers/net/mlx5/mlx5_rxtx_vec_sse.h:956
>> > #3  0x0000555555f055ea in mlx5_rx_burst_vec (dpdk_rxq=0x22c910ccc0,
>> >     pkts=0x7fffe77eba40, pkts_n=32)
>> >     at /usr/local/dpdk-stable-18.11.2/drivers/net/mlx5/mlx5_rxtx_vec.c:238
>> > #4  0x0000555555632772 in rte_eth_rx_burst (port_id=4, queue_id=5,
>> >     rx_pkts=0x7fffe77eba40, nb_pkts=32)
>> >     at /usr/local/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:3879
>> >
>> > My environments are:
>> >
>> > Ubuntu 18.04.2 LTS 4.15.0-50-generic
>> > MLNX_OFED_LINUX-4.5-1.0.1.0-ubuntu18.04-x86_64
>> > fw_ver: 12.17.2020
>> > vendor_id: 0x02c9
>> > vendor_part_id: 4115
>> > hw_ver: 0x0
>> > board_id: LNR3270110033
>> > DPDK 18.11.2
>> >
>> > It looks like the CQE compression is the crashing place.
>> >
>> > dpdk-stable-18.11.2/drivers/net/mlx5/mlx5_rxtx_vec_sse.h:956
>> > 953         /* Decompress the last CQE if compressed. */
>> > 954         if (comp_idx < MLX5_VPMD_DESCS_PER_LOOP && comp_idx == n) {
>> > 955                 assert(comp_idx == (nocmp_n % MLX5_VPMD_DESCS_PER_LOOP));
>> > 956                 rxq_cq_decompress_v(rxq, &cq[nocmp_n], &elts[nocmp_n]);
>> >
>> > And I'm wondering how I can disable rxq_cqe_comp_en devargs.
>> >
>> >
>> > [...].dpdk.org/guides-18.02/nics/mlx5.html
>> > 22.5.3. Run-time configuration
>> > rxq_cqe_comp_en parameter [int]
>> >
>> > Any information or guesses are appreciated.
>> >
>> > Best regards,
>> > Yasu
>> >
>
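
For reference, the workaround Slava mentions ("rxq_cqe_comp_en=0") is a per-device devarg, so it is appended to the PCI address of the port. Below is a minimal sketch in C of how an application might build that string before handing it to the EAL. The helper name build_mlx5_devargs and the PCI address 0000:03:00.0 are hypothetical and not taken from the thread; only the devarg name and value come from the messages above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DPDK): format "<pci_addr>,rxq_cqe_comp_en=0"
 * into buf. Returns 0 on success, -1 if the output was truncated or
 * snprintf() failed.
 */
static int
build_mlx5_devargs(char *buf, size_t len, const char *pci_addr)
{
	/* The devarg name and value are the ones quoted in the thread. */
	int n = snprintf(buf, len, "%s,rxq_cqe_comp_en=0", pci_addr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With the assumed address, the resulting string is "0000:03:00.0,rxq_cqe_comp_en=0". On DPDK 18.11 it would typically be passed to the EAL through the PCI whitelist option, i.e. "-w 0000:03:00.0,rxq_cqe_comp_en=0". This only turns Rx CQE compression off for that port; it does not explain why elts contains invalid entries.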