From: "Hyong Youb Kim (hyonkim)"
To: Thomas Monjalon, David Marchand
CC: "dev@dpdk.org", "anatoly.burakov@intel.com",
 "alex.williamson@redhat.com", "maxime.coquelin@redhat.com",
 "stephen@networkplumber.org", "igor.russkikh@aquantia.com",
 "pavel.belous@aquantia.com", "allain.legacy@windriver.com",
 "matt.peters@windriver.com", "ravi1.kumar@amd.com", "rmody@marvell.com",
 "shshaikh@marvell.com", "ajit.khaparde@broadcom.com",
 "somnath.kotur@broadcom.com", "hemant.agrawal@nxp.com",
 "shreyansh.jain@nxp.com", "wenzhuo.lu@intel.com", "mw@semihalf.com",
 "mk@semihalf.com", "gtzalik@amazon.com", "evgenys@amazon.com",
 "John Daley (johndale)", "qi.z.zhang@intel.com", "xiao.w.wang@intel.com",
 "xuanziyang2@huawei.com", "cloud.wangxiaoyun@huawei.com",
 "zhouguoyang@huawei.com", "beilei.xing@intel.com", "jingjing.wu@intel.com",
 "qiming.yang@intel.com", "konstantin.ananyev@intel.com",
 "alejandro.lucero@netronome.com", "arybchenko@solarflare.com",
 "tiwei.bie@intel.com", "zhihong.wang@intel.com", "yongwang@vmware.com",
 "stable@dpdk.org"
Date: Sun, 14 Jul 2019 05:10:49 +0000
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH] vfio: fix interrupts race condition
References: <1562071706-11009-1-git-send-email-david.marchand@redhat.com>
 <1562762020-8259-1-git-send-email-david.marchand@redhat.com>
 <1796500.5oFe8j95cd@xps>
In-Reply-To: <1796500.5oFe8j95cd@xps>
List-Id: patches for DPDK stable branches

> -----Original Message-----
> From: Thomas Monjalon
> Sent: Thursday, July 11, 2019 6:21 AM
[...]
> Subject: Re: [dpdk-dev] [PATCH] vfio: fix interrupts race condition
>
> 10/07/2019 14:33, David Marchand:
> > Populating the eventfd in rte_intr_enable in each request to vfio
> > triggers a reconfiguration of the interrupt handler on the kernel side.
> > The problem is that rte_intr_enable is often used to re-enable masked
> > interrupts from drivers interrupt handlers.
> >
> > This reconfiguration leaves a window during which a device could send
> > an interrupt and then the kernel logs this (unsolicited from the kernel
> > point of view) interrupt:
> > [158764.159833] do_IRQ: 9.34 No irq handler for vector
> >
> > VFIO api makes it possible to set the fd at setup time.
> > Make use of this and then we only need to ask for masking/unmasking
> > legacy interrupts and we have nothing to do for MSI/MSIX.
> >
> > "rxtx" interrupts are left untouched but are most likely subject to the
> > same issue.
> >
> > Fixes: 5c782b3928b8 ("vfio: interrupts")
> > Cc: stable@dpdk.org
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1654824
> > Signed-off-by: David Marchand
> > Tested-by: Shahed Shaikh
>
> This is a real bug which should be fixed in this release.
> As the patch is quite big and needs a strong validation,
> I prefer merging it quickly to give a lot of time before
> releasing 19.08-rc2.
> The maintainers of all concerned PMDs are Cc.
> Please make sure the interrupts are still working well with VFIO.
>
> Applied, thanks
>

[Apologies in advance if the email format gets messed up. Forced to use
Outlook for the first time.]

Hi,

This commit breaks MSI-X + rxq interrupts. I think others are seeing
the same error?

sudo ~/dpdk/examples/l3fwd-power/build/l3fwd-power \
-c 0x1e -n 4 -w 0000:1a:00.0 --log-level=pmd,debug -- -p 0x1 -P --config "(0,0,2),(0,1,3),(0,2,4)"
[...]
EAL: Error enabling MSI-X interrupts for fd 35

A rough sequence of events goes like this. The above test uses 3 rxqs
(3 interrupts).

1. During probe, pci_vfio_setup_interrupts() runs.
This now does ioctl(VFIO_DEVICE_SET_IRQS) for the 1st efd (intr_handle->fd).
ioctl does:
- pci_enable_msix(1 vector) because this is the first time enabling interrupts.
- request_irq(vector 0)

2. App configs
The app sets port_conf.intr_conf.rxq=1, configs 3 rxqs, etc.

3. rte_eth_dev_start()
PMD calls:
- rte_intr_efd_enable()
This creates 3 efds (intr_handle->nb_efd = 3).
- rte_intr_enable() => vfio_enable_msix()
This does ioctl(VFIO_DEVICE_SET_IRQS) for the 3 efds.
ioctl now needs to request_irq() for vectors 1, 2, 3 for the 3 new efds.
It does not do another pci_enable_msix() as that has been done earlier.
Before calling request_irq(), it sees that only 1 vector was enabled in the
earlier pci_enable_msix(), so it fails with EINVAL.

We would need pci_enable_msix(4 vectors) for this to work
(intr_handle->fd + 3 efds).

Prior to this patch, VFIO_DEVICE_SET_IRQS was done only in
vfio_enable_msix(). So, ioctl ended up doing pci_enable_msix(4 vectors)
and request_irq() for each of the 4 efds, which completed successfully.

Not an expert in this area. Perhaps defer enabling the 1st efd
(intr_handle->fd) until the first invocation of vfio_enable_msix(), so it
knows the app wants to use 4 vectors in total?

Also, vfio_disable_msix() looks a bit wrong.

        irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
        irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
        irq_set.start = RTE_INTR_VEC_RXTX_OFFSET;
        irq_set.count = intr_handle->nb_efd;

This tells vfio-pci to simulate interrupts by triggering efds?
To free_irq() specific efds, I think we need DATA_EVENTFD and set fd = -1.

flags = DATA_EVENTFD | ACTION_TRIGGER
data = [fd(-1), fd(-1), ...]

I have not tested this part myself yet.

Thanks..
-Hyong
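
P.S. To make the DATA_EVENTFD idea above concrete, here is a minimal
sketch of how such a request could be filled in. It is untested against a
real device and is not the actual vfio_disable_msix() code. The struct and
flag names come from <linux/vfio.h>; build_msix_detach() is a hypothetical
helper name, and RXTX_OFFSET is a local stand-in for
RTE_INTR_VEC_RXTX_OFFSET, assumed here to be 1. The sketch only builds the
request and does not issue the ioctl.

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-in for DPDK's RTE_INTR_VEC_RXTX_OFFSET (assumed to be 1). */
#define RXTX_OFFSET 1

/*
 * Hypothetical helper: build a VFIO_DEVICE_SET_IRQS request that passes
 * fd = -1 for `count` MSI-X vectors starting at vector `start`.
 * Returns a heap buffer that the caller must free(), or NULL on failure.
 */
struct vfio_irq_set *
build_msix_detach(uint32_t start, uint32_t count)
{
	size_t len;
	struct vfio_irq_set *irq_set;
	int32_t *fds;
	uint32_t i;

	assert(count > 0);

	/* Header plus one 32-bit eventfd slot per vector. */
	len = sizeof(*irq_set) + count * sizeof(int32_t);
	irq_set = calloc(1, len);
	if (irq_set == NULL)
		return NULL;

	irq_set->argsz = (uint32_t)len;
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
			 VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	irq_set->start = start;
	irq_set->count = count;

	/* -1 in every slot: no eventfd for that vector. */
	fds = (int32_t *)irq_set->data;
	for (i = 0; i < count; i++)
		fds[i] = -1;

	return irq_set;
}
```

The caller would then pass the buffer to
ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set), e.g. with
build_msix_detach(RXTX_OFFSET, intr_handle->nb_efd), and free it
afterwards. Whether this really ends in free_irq() for just those vectors
is what I still need to verify.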