From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Xiaoping Yan (NSB)"
To: "users@dpdk.org", Matan Azrad, "dekelp@nvidia.com"
Subject: RE: dpdk mlx5 driver crash in rxq_cq_decompress_v
Date: Mon, 3 Jul 2023 01:18:22 +0000
Message-ID: <7902affaba614875bf33c6f88a43e1c6@nokia-sbell.com>
References: <5376a9878f1c4eaeadef8591ea0d26cd@nokia-sbell.com>
In-Reply-To: <5376a9878f1c4eaeadef8591ea0d26cd@nokia-sbell.com>
List-Id: DPDK usage discussions
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="_000_7902affaba614875bf33c6f88a43e1c6nokiasbellcom_"

--_000_7902affaba614875bf33c6f88a43e1c6nokiasbellcom_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi,

 

@'dekelp@nvidia.com' @'Matan Azrad' Could you kindly advise?

Thank you.

 

Br, Xiaoping

 

From: Xiaoping Yan (NSB)
Sent: June 27, 2023 12:11
To: users@dpdk.org; 'Matan Azrad' <matan@nvidia.com>; 'dekelp@nvidia.com' <dekelp@nvidia.com>
Subject: dpdk mlx5 driver crash in rxq_cq_decompress_v

 

Hi,

 

dpdk version in use: 21.11.2

 

The mlx5 driver crashes in rxq_cq_decompress_v after several minutes of a traffic test.

Stack trace:

(gdb) bt
#0  0x00007ffff58612bc in _mm_storeu_si128 (__B=..., __P=<optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/12/include/emmintrin.h:739
#1  rxq_cq_decompress_v (rxq=rxq@entry=0x2abe5592f40, cq=cq@entry=0x2abe54fdb00, elts=elts@entry=0x2abe5594638)
    at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec_sse.h:142
#2  0x00007ffff5862c84 in rxq_burst_v (no_cq=<synthetic pointer>, err=0x7fffffffb848, pkts_n=4, pkts=<optimized out>,
    rxq=0x2abe5592f40) at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec.c:349
#3  mlx5_rx_burst_vec (dpdk_rxq=0x2abe5592f40, pkts=0x7fffffffbf80, pkts_n=32) at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec.c:393
#4  0x00005555556a0f41 in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7fffffffbf80, queue_id=0, port_id=1)
    at /usr/include/rte_ethdev.h:5721
…

Attached are the error log ("Unexpected CQE error syndrome…") and the dump file.

 

I found a similar bug here: https://bugs.dpdk.org/show_bug.cgi?id=334

But the fix (88c0733535d6 extend Rx completion with error handling) should already be included, as I'm using 21.11.2.

Also, the commit below (a fix to 88c0733535d6) is already included in my DPDK version.

commit 60b254e3923d007bcadbb8d410f95ad89a2f13fa
Author: Matan Azrad <matan@nvidia.com>
Date:   Thu Aug 11 19:51:55 2022 +0300

    net/mlx5: fix Rx queue recovery mechanism

 

Any suggestions?

Thank you.

 

Br, Xiaoping
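
For triage, the backtrace quoted above can be reduced to a list of function and source-location pairs. The following is only a minimal sketch, assuming the gdb `bt` output is pasted in as a string; the helper name `parse_frames` and the regular expression are illustrative and are not part of DPDK or gdb.

```python
import re

# Backtrace text as quoted in this message (frames #0 to #4).
BACKTRACE = """\
#0  0x00007ffff58612bc in _mm_storeu_si128 (__B=..., __P=<optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/12/include/emmintrin.h:739
#1  rxq_cq_decompress_v (rxq=rxq@entry=0x2abe5592f40, cq=cq@entry=0x2abe54fdb00, elts=elts@entry=0x2abe5594638)
    at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec_sse.h:142
#2  0x00007ffff5862c84 in rxq_burst_v (no_cq=<synthetic pointer>, err=0x7fffffffb848, pkts_n=4, pkts=<optimized out>,
    rxq=0x2abe5592f40) at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec.c:349
#3  mlx5_rx_burst_vec (dpdk_rxq=0x2abe5592f40, pkts=0x7fffffffbf80, pkts_n=32) at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec.c:393
#4  0x00005555556a0f41 in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7fffffffbf80, queue_id=0, port_id=1)
    at /usr/include/rte_ethdev.h:5721
"""

# Illustrative pattern for "#N  [0xADDR in ]func (args) at file:line".
# DOTALL lets the argument list span a line break, as in frame #2.
FRAME_RE = re.compile(
    r"#(?P<idx>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\w+) \(.*?\)\s+at (?P<file>\S+):(?P<line>\d+)",
    re.DOTALL,
)


def parse_frames(text):
    """Return (index, function, file, line) tuples for each frame found."""
    return [
        (int(m["idx"]), m["func"], m["file"], int(m["line"]))
        for m in FRAME_RE.finditer(text)
    ]


frames = parse_frames(BACKTRACE)
for idx, func, path, line in frames:
    # Print the function with the base name of its source file.
    print(f"#{idx} {func} ({path.rsplit('/', 1)[-1]}:{line})")
```

The innermost driver frame (#1) gives the file and line to compare against the mlx5 source for the DPDK version in use.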

 

--_000_7902affaba614875bf33c6f88a43e1c6nokiasbellcom_--