From: "Stokes, Ian"
To: "Alex Yeh (ayeh)", "dev@dpdk.org"
CC: "Yegappan Lakshmanan (yega)"
Date: Thu, 19 Nov 2020 11:21:16 +0000
Subject: Re: [dpdk-dev] [ovs-dev] ovs-vswitchd with DPDK crashed when guest VM restarts network service
List-Id: DPDK patches and discussions

> Hi,
> We are seeing an ovs-vswitchd service crash with a segfault in the
> librte_vhost library when a DPDK application within a guest VM is stopped.
>
> We are using OVS 2.11.1 on CentOS 7.6 (3.10.0-1062 Linux kernel) with
> DPDK 18.11.2.

Hi,

Is there a reason you are using OVS 2.11.1 and DPDK 18.11.2? These are quite old.

As a first step I would recommend using the latest releases of these branches that have been validated by the OVS community.

As of now this would be OVS 2.11.4 and DPDK 18.11.9. Please check whether the issue is still present there; my suspicion is that this could be an issue resolved in the DPDK library since 18.11.2.

Regards
Ian

>
> We are using OVS-DPDK on the host and the guest VM is running a DPDK
> application. With some traffic, if the application service within the VM is
> restarted, then OVS crashes.
>
> This crash is not seen if the guest VM is restarted (instead of stopping
> the application within the VM).
>
> The crash backtrace (attached below) points to the
> rte_memcpy_generic() function in rte_memcpy.h. It looks like the crash occurs
> when vhost is trying to dequeue the packets from the guest VM (as the
> application in the guest VM has stopped and the huge pages are returned to the
> guest kernel).
>
> We have tried enabling iommu in OVS by setting
> "other_config:vhost-iommu-support=true" and enabling iommu in qemu using
> the following configuration in the guest domain XML:
>
>
>
> With iommu enabled, ovs-vswitchd still crashes when the guest VM restarts
> the network service.
>
> Is this a known problem? Has anyone else seen a crash like this? How can
> we protect ovs-vswitchd from crashing when a guest VM restarts the
> network application or service?
>
> Thanks
> Alex
> ------------------------------------------------------------------------
>
> Log:
> Oct 7 19:54:16 Branch81-Bravo kernel: [2245909.596635] pmd16[25721]: segfault at 7f4d1d733000 ip 00007f4d2ae5d066 sp 00007f4d1ce65618 error 4 in librte_vhost.so.4[7f4d2ae52000+1a000]
> Oct 7 19:54:19 Branch81-Bravo systemd[1]: ovs-vswitchd.service: main process exited, code=killed, status=11/SEGV
>
> Environment:
> CentOS 7.6.1810
> openvswitch-2.11.1-1.el7.centos.x86_64
> openvswitch-kmod-2.11.1-1.el7.centos.x86_64
> dpdk-18.11-2.el7.centos.x86_64
> 3.10.0-1062.4.1.el7.x86_64
> qemu-kvm-ev-2.12.0-18.el7.centos_6.1.1
>
> Core dump trace:
> (gdb) bt
> #-1 0x00007ffff205602e in rte_memcpy_generic (dst=, src=0x7fffcef3607c, n=)
>     at /usr/src/debug/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:793
> Backtrace stopped: Cannot access memory at address 0x7ffff20558f0
>
> (gdb) list *0x00007ffff205602e
> 0x7ffff205602e is in rte_memcpy_generic (/usr/src/debug/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:793).
> 788     }
> 789
> 790     /**
> 791      * For copy with unaligned load
> 792      */
> 793     MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
> 794
> 795     /**
> 796      * Copy whatever left
> 797      */
>
> (gdb) list *0x00007ffff205c192
> 0x7ffff205c192 is in rte_vhost_dequeue_burst (/usr/src/debug/dpdk-18.11/lib/librte_vhost/virtio_net.c:1192).
> 1187     * In zero copy mode, one mbuf can only reference data
> 1188     * for one or partial of one desc buff.
> 1189     */
> 1190    mbuf_avail = cpy_len;
> 1191    } else {
> 1192        if (likely(cpy_len > MAX_BATCH_LEN ||
> 1193                   vq->batch_copy_nb_elems >= vq->size ||
> 1194                   (hdr && cur == m))) {
> 1195            rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *,
> 1196                                               mbuf_offset),
> (gdb)
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
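
For anyone triaging a similar report, the kernel segfault line in the quoted log can be decoded mechanically. The following is a minimal sketch, not part of the original thread: the function name `decode_segfault` and the dictionary keys are made up for illustration. It assumes the usual x86 page-fault error-code layout (bit 0 = page present, bit 1 = write access, bit 2 = user mode) and that the kernel prints the error code in hex. The computed offset is relative to the mapping the kernel printed, which typically, but not always, matches the offset inside the shared object.

```python
import re

# Kernel segfault line copied from the quoted report
# (timestamp and hostname dropped for brevity).
LOG_LINE = (
    "pmd16[25721]: segfault at 7f4d1d733000 ip 00007f4d2ae5d066 "
    "sp 00007f4d1ce65618 error 4 in librte_vhost.so.4[7f4d2ae52000+1a000]"
)

# Field layout of the kernel's "segfault at ..." message; all numbers are hex.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its fields."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    err = int(match.group("err"), 16)
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    return {
        "object": match.group("obj"),
        "fault_addr": int(match.group("addr"), 16),
        # Offset of the faulting instruction inside the printed mapping.
        "ip_offset": ip - base,
        # x86 page-fault error-code bits (assumed layout, see lead-in).
        "page_present": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }


if __name__ == "__main__":
    info = decode_segfault(LOG_LINE)
    print(info["object"], hex(info["ip_offset"]))
    print("present:", info["page_present"],
          "write:", info["write"],
          "user:", info["user_mode"])
```

For the log line above, "error 4" decodes to a user-mode read of a page that is not mapped, which is consistent with the reporter's reading that vhost was copying from guest memory that was no longer available. The instruction offset can be passed to a tool such as addr2line against the matching debuginfo when no core dump is at hand.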