From mboxrd@z Thu Jan  1 00:00:00 1970
From: Thomas Monjalon <thomas@monjalon.net>
To: "Hyong Youb Kim (hyonkim)" <hyonkim@cisco.com>
Cc: David Marchand <david.marchand@redhat.com>, "dev@dpdk.org" <dev@dpdk.org>,
 "anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
 "alex.williamson@redhat.com" <alex.williamson@redhat.com>,
 "maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
 "stephen@networkplumber.org" <stephen@networkplumber.org>,
 "igor.russkikh@aquantia.com" <igor.russkikh@aquantia.com>,
 "pavel.belous@aquantia.com" <pavel.belous@aquantia.com>,
 "allain.legacy@windriver.com" <allain.legacy@windriver.com>,
 "matt.peters@windriver.com" <matt.peters@windriver.com>,
 "ravi1.kumar@amd.com" <ravi1.kumar@amd.com>,
 "rmody@marvell.com" <rmody@marvell.com>,
 "shshaikh@marvell.com" <shshaikh@marvell.com>,
 "ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
 "somnath.kotur@broadcom.com" <somnath.kotur@broadcom.com>,
 "hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
 "shreyansh.jain@nxp.com" <shreyansh.jain@nxp.com>,
 "wenzhuo.lu@intel.com" <wenzhuo.lu@intel.com>,
 "mw@semihalf.com" <mw@semihalf.com>, "mk@semihalf.com" <mk@semihalf.com>,
 "gtzalik@amazon.com" <gtzalik@amazon.com>,
 "evgenys@amazon.com" <evgenys@amazon.com>,
 "John Daley (johndale)" <johndale@cisco.com>,
 "qi.z.zhang@intel.com" <qi.z.zhang@intel.com>,
 "xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
 "xuanziyang2@huawei.com" <xuanziyang2@huawei.com>,
 "cloud.wangxiaoyun@huawei.com" <cloud.wangxiaoyun@huawei.com>,
 "zhouguoyang@huawei.com" <zhouguoyang@huawei.com>,
 "beilei.xing@intel.com" <beilei.xing@intel.com>,
 "jingjing.wu@intel.com" <jingjing.wu@intel.com>,
 "qiming.yang@intel.com" <qiming.yang@intel.com>,
 "konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
 "alejandro.lucero@netronome.com" <alejandro.lucero@netronome.com>,
 "arybchenko@solarflare.com" <arybchenko@solarflare.com>,
 "tiwei.bie@intel.com" <tiwei.bie@intel.com>,
 "zhihong.wang@intel.com" <zhihong.wang@intel.com>,
 "yongwang@vmware.com" <yongwang@vmware.com>,
 "stable@dpdk.org" <stable@dpdk.org>
Date: Sun, 14 Jul 2019 13:19:45 +0200
Message-ID: <4647179.CTOrK8BQiK@xps>
In-Reply-To: <MWHPR11MB18392141DEEF5C77A761FAD0BFCC0@MWHPR11MB1839.namprd11.prod.outlook.com>
References: <1562071706-11009-1-git-send-email-david.marchand@redhat.com>
 <1796500.5oFe8j95cd@xps>
 <MWHPR11MB18392141DEEF5C77A761FAD0BFCC0@MWHPR11MB1839.namprd11.prod.outlook.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [dpdk-dev] [PATCH] vfio: fix interrupts race condition
List-Id: DPDK patches and discussions <dev.dpdk.org>

14/07/2019 07:10, Hyong Youb Kim (hyonkim):
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Thursday, July 11, 2019 6:21 AM
> [...]
> > Subject: Re: [dpdk-dev] [PATCH] vfio: fix interrupts race condition
> > 
> > 10/07/2019 14:33, David Marchand:
> > > Populating the eventfd in rte_intr_enable in each request to vfio
> > > triggers a reconfiguration of the interrupt handler on the kernel side.
> > > The problem is that rte_intr_enable is often used to re-enable masked
> > > interrupts from drivers' interrupt handlers.
> > >
> > > This reconfiguration leaves a window during which a device could send
> > > an interrupt and then the kernel logs this (unsolicited from the kernel
> > > point of view) interrupt:
> > > [158764.159833] do_IRQ: 9.34 No irq handler for vector
> > >
> > > The VFIO API makes it possible to set the fd at setup time.
> > > Make use of this and then we only need to ask for masking/unmasking
> > > legacy interrupts and we have nothing to do for MSI/MSIX.
> > >
> > > "rxtx" interrupts are left untouched but are most likely subject to the
> > > same issue.
> > >
> > > Fixes: 5c782b3928b8 ("vfio: interrupts")
> > > Cc: stable@dpdk.org
> > >
> > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1654824
> > > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > > Tested-by: Shahed Shaikh <shshaikh@marvell.com>
> > 
> > This is a real bug which should be fixed in this release.
> > As the patch is quite big and needs a strong validation,
> > I prefer merging it quickly to give a lot of time before
> > releasing 19.08-rc2.
> > The maintainers of all concerned PMDs are Cc'ed.
> > Please make sure the interrupts are still working well with VFIO.
> > 
> > Applied, thanks
> > 
> 
> [Apologies in advance if the email format gets messed up. I am forced to
> use Outlook for the first time.]
> 
> Hi,
> 
> This commit breaks MSI-X + rxq interrupts. I think others are seeing
> the same error?
> 
> sudo ~/dpdk/examples/l3fwd-power/build/l3fwd-power \
> -c 0x1e -n 4 -w 0000:1a:00.0 --log-level=pmd,debug -- -p 0x1 -P --config "(0,0,2),(0,1,3),(0,2,4)"
> [...]
> EAL: Error enabling MSI-X interrupts for fd 35
> 
> A rough sequence of events goes like this. The above test is using 3
> rxqs (3 interrupts).
> 
> 1. During probe, pci_vfio_setup_interrupts() runs.
> This now does ioctl(VFIO_DEVICE_SET_IRQS) for the 1st efd
> (intr_handle->fd).
> 
> ioctl does:
> - pci_enable_msix(1 vector) because this is the first time enabling
>   interrupts.
> - request_irq(vector 0)
> 
> 2. App configs
> The app sets port_conf.intr_conf.rxq=1, configs 3 rxqs, etc.
> 
> 3. rte_eth_dev_start()
> PMD calls:
> - rte_intr_efd_enable()
>   This creates 3 efds (intr_handle->nb_efd = 3).
> - rte_intr_enable() => vfio_enable_msix()
>   This does ioctl(VFIO_DEVICE_SET_IRQS) for the 3 efds.
> 
> ioctl now needs to request_irq() for vectors 1, 2, 3 for the 3 new
> efds. It does not do another pci_enable_msix() as it has been done
> earlier. Before calling request_irq(), it sees that only 1 vector was
> enabled in earlier pci_enable_msix(), so it fails with EINVAL.
> 
> We would need pci_enable_msix(4 vectors) for this to work
> (intr_handle->fd + 3 efds).
> 
> Prior to this patch, VFIO_DEVICE_SET_IRQS is done only in
> vfio_enable_msix(). So, ioctl ends up doing pci_enable_msix(4 vectors)
> and request_irq() for each of the 4 efds, which completes
> successfully.
> 
> I am not an expert in this area. Perhaps defer enabling the 1st efd
> (intr_handle->fd) until the first invocation of vfio_enable_msix(), so
> that it knows the app wants to use 4 vectors in total?
> 
> Also, vfio_disable_msix() looks a bit wrong.
> 
>         irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
>         irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
>         irq_set.start = RTE_INTR_VEC_RXTX_OFFSET;
>         irq_set.count = intr_handle->nb_efd;
> 
> Doesn't this tell vfio-pci to simulate interrupts by triggering the
> efds? To free_irq() specific efds, I think we need DATA_EVENTFD and
> fd = -1 for each of them.
> 
> flags = DATA_EVENTFD | ACTION_TRIGGER
> data = [fd(-1), fd(-1), ...]
> 
> I have not tested this part myself yet.
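
For reference, the request suggested above could be built roughly as
follows. This is an untested sketch using only the struct and flags
from <linux/vfio.h>; build_msix_detach_req() is a hypothetical helper,
not an existing DPDK function, and the caller is assumed to pass the
result to ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, req) and free it.

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a VFIO_DEVICE_SET_IRQS request that passes fd = -1 for the
 * MSI-X vectors [start, start + count), asking vfio-pci to de-assign
 * their eventfds. Returns NULL on allocation failure.
 */
static struct vfio_irq_set *
build_msix_detach_req(uint32_t start, uint32_t count)
{
	size_t len = sizeof(struct vfio_irq_set) + count * sizeof(int32_t);
	struct vfio_irq_set *req;
	int32_t fd = -1;
	uint32_t i;

	assert(count > 0);

	req = calloc(1, len);
	if (req == NULL)
		return NULL;

	req->argsz = (uint32_t)len;
	req->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	req->index = VFIO_PCI_MSIX_IRQ_INDEX;
	req->start = start;
	req->count = count;

	/* data[] is a byte array holding one 32-bit fd per vector. */
	for (i = 0; i < count; i++)
		memcpy(req->data + i * sizeof(fd), &fd, sizeof(fd));

	return req;
}
```

In vfio_disable_msix() this would presumably be called with
start = RTE_INTR_VEC_RXTX_OFFSET and count = intr_handle->nb_efd, the
same values as in the snippet quoted above.
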

Thanks for your detailed report, Hyong.
Would you be able to propose a fix?