From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Alex Yeh (ayeh)" <ayeh@cisco.com>
To: "dev@dpdk.org" <dev@dpdk.org>, "ovs-dev@openvswitch.org"
 <ovs-dev@openvswitch.org>
CC: "Yegappan Lakshmanan (yega)" <yega@cisco.com>
Thread-Topic: ovs-vswitchd with DPDK crashed when guest VM restarts network
 service
Thread-Index: Ada+FTJhUXeyjkIDTx6/zgtok5XJUQ==
Date: Thu, 19 Nov 2020 01:44:31 +0000
Message-ID: <BY5PR11MB4056B227C2D7FF2F5DFC8234D5E00@BY5PR11MB4056.namprd11.prod.outlook.com>
Accept-Language: en-US
Content-Language: en-US
MIME-Version: 1.0
X-Mailman-Approved-At: Thu, 19 Nov 2020 04:54:30 +0100
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Subject: [dpdk-dev] ovs-vswitchd with DPDK crashed when guest VM restarts
	network service
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi,
We are seeing the ovs-vswitchd service crash with a segfault in the
librte_vhost library when a DPDK application within a guest VM is
stopped.

We are using OVS 2.11.1 on CentOS 7.6 (3.10.0-1062 Linux kernel) with
DPDK 18.11.2.

We are using OVS-DPDK on the host, and the guest VM is running a DPDK
application. If the application service within the VM is restarted
while traffic is flowing, OVS crashes.

The crash is not seen if the whole guest VM is restarted (instead of
stopping only the application within the VM).

The crash backtrace (included below) points to the
rte_memcpy_generic() function in rte_memcpy.h. It looks like the crash
occurs when vhost is trying to dequeue packets from the guest VM after
the application in the guest VM has stopped and its huge pages have
been returned to the guest kernel.
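
One generic way a host process can guard against this class of fault is
to check that every guest-supplied address range lies fully inside a
mapped guest memory region before copying from it. The following is a
minimal sketch of that idea only. It is not the librte_vhost code, the
names guest_region, guest_to_host and copy_from_guest are hypothetical,
and whether the crash reported here is actually caused by an
out-of-range descriptor is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one guest memory region mapped in the host. */
struct guest_region {
    uint64_t guest_addr;  /* start of the region in guest address space */
    uint64_t size;        /* length of the region in bytes */
    uint8_t *host_base;   /* where the region is mapped in this process */
};

/*
 * Translate [gaddr, gaddr + len) to a host pointer. Returns NULL unless
 * the whole range is contained in a single mapped region.
 */
static void *guest_to_host(const struct guest_region *regs, size_t nregs,
                           uint64_t gaddr, uint64_t len)
{
    for (size_t i = 0; i < nregs; i++) {
        const struct guest_region *r = &regs[i];

        if (gaddr < r->guest_addr)
            continue;

        uint64_t off = gaddr - r->guest_addr;

        /* Compare against size - off so that off + len cannot overflow. */
        if (off <= r->size && len <= r->size - off)
            return r->host_base + off;
    }
    return NULL;
}

/* Copy from guest memory. Returns 0 on success, -1 on an invalid range. */
static int copy_from_guest(void *dst, const struct guest_region *regs,
                           size_t nregs, uint64_t gaddr, uint64_t len)
{
    const void *src = guest_to_host(regs, nregs, gaddr, len);

    if (src == NULL)
        return -1;

    memcpy(dst, src, (size_t)len);
    return 0;
}
```

A check like this only turns an out-of-range access into an error
return. It does not help if the guest hands over stale addresses that
still fall inside a mapped region.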

We have tried enabling IOMMU support in OVS by setting
"other_config:vhost-iommu-support=true" and enabling the IOMMU in QEMU
using the following configuration in the guest domain XML:
<iommu model='intel'>
    <driver intremap='on'/>
</iommu>
With the IOMMU enabled, ovs-vswitchd still crashes when the guest VM
restarts the network service.

Is this a known problem? Has anyone else seen a crash like this? How
can we protect ovs-vswitchd from crashing when a guest VM restarts the
network application or service?

Thanks
Alex
------------------------------------------------------------------------

Log:
Oct 7 19:54:16 Branch81-Bravo kernel: [2245909.596635] pmd16[25721]: segfault at 7f4d1d733000 ip 00007f4d2ae5d066 sp 00007f4d1ce65618 error 4 in librte_vhost.so.4[7f4d2ae52000+1a000]
Oct 7 19:54:19 Branch81-Bravo systemd[1]: ovs-vswitchd.service: main process exited, code=killed, status=11/SEGV
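
For readers unfamiliar with the kernel segfault line, the fields can be
decoded as sketched below. The "error" value is the x86 page-fault
error code, and the bracketed pair after the library name is the base
address and size of the mapping that contains the faulting instruction.
The struct and function names are hypothetical helpers for
illustration.

```c
#include <stdint.h>

/* x86 page-fault error code bits, as printed in the "error N" field. */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access */
#define PF_USER  0x4u  /* 0: kernel mode,      1: user mode */

struct fault_info {
    int not_present;     /* fault hit a page that is not mapped */
    int is_write;        /* faulting access was a write */
    int user_mode;       /* fault happened in user mode */
    int ip_in_lib;       /* instruction pointer lies inside the mapping */
    uint64_t lib_offset; /* ip relative to the mapping base, if inside */
};

static struct fault_info decode_segfault(uint64_t ip, unsigned error,
                                         uint64_t lib_base,
                                         uint64_t lib_size)
{
    struct fault_info f;

    f.not_present = !(error & PF_PROT);
    f.is_write = !!(error & PF_WRITE);
    f.user_mode = !!(error & PF_USER);
    f.ip_in_lib = ip >= lib_base && ip - lib_base < lib_size;
    f.lib_offset = f.ip_in_lib ? ip - lib_base : 0;
    return f;
}
```

With the values from the log (ip 0x7f4d2ae5d066, error 4, base
0x7f4d2ae52000, size 0x1a000), this decodes to a user-mode read of a
page that is not present, at offset 0xb066 into the librte_vhost.so.4
mapping. The faulting address 7f4d1d733000 is page-aligned, which is
consistent with a copy running past the end of a mapped range, although
the log alone does not prove that.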

Environment:
CentOS 7.6.1810
openvswitch-2.11.1-1.el7.centos.x86_64
openvswitch-kmod-2.11.1-1.el7.centos.x86_64
dpdk-18.11-2.el7.centos.x86_64
3.10.0-1062.4.1.el7.x86_64
qemu-kvm-ev-2.12.0-18.el7.centos_6.1.1

Core dump trace:
(gdb) bt
#-1 0x00007ffff205602e in rte_memcpy_generic (dst=<optimized out>,
src=0x7fffcef3607c, n=<optimized out>)
at /usr/src/debug/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:793
Backtrace stopped: Cannot access memory at address 0x7ffff20558f0

(gdb) list *0x00007ffff205602e
0x7ffff205602e is in rte_memcpy_generic (/usr/src/debug/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:793).
788 }
789
790 /**
791 * For copy with unaligned load
792 */
793 MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
794
795 /**
796 * Copy whatever left
797 */

(gdb) list *0x00007ffff205c192
0x7ffff205c192 is in rte_vhost_dequeue_burst (/usr/src/debug/dpdk-18.11/lib/librte_vhost/virtio_net.c:1192).
1187 * In zero copy mode, one mbuf can only reference data
1188 * for one or partial of one desc buff.
1189 */
1190 mbuf_avail = cpy_len;
1191 } else {
1192 if (likely(cpy_len > MAX_BATCH_LEN ||
1193 vq->batch_copy_nb_elems >= vq->size ||
1194 (hdr && cur == m))) {
1195 rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *,
1196 mbuf_offset),
(gdb)