From: "Hyong Youb Kim (hyonkim)" <hyonkim@cisco.com>
To: Thomas Monjalon <thomas@monjalon.net>,
David Marchand <david.marchand@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
"stephen@networkplumber.org" <stephen@networkplumber.org>,
"igor.russkikh@aquantia.com" <igor.russkikh@aquantia.com>,
"pavel.belous@aquantia.com" <pavel.belous@aquantia.com>,
"allain.legacy@windriver.com" <allain.legacy@windriver.com>,
"matt.peters@windriver.com" <matt.peters@windriver.com>,
"ravi1.kumar@amd.com" <ravi1.kumar@amd.com>,
"rmody@marvell.com" <rmody@marvell.com>,
"shshaikh@marvell.com" <shshaikh@marvell.com>,
"ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
"somnath.kotur@broadcom.com" <somnath.kotur@broadcom.com>,
"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
"shreyansh.jain@nxp.com" <shreyansh.jain@nxp.com>,
"wenzhuo.lu@intel.com" <wenzhuo.lu@intel.com>,
"mw@semihalf.com" <mw@semihalf.com>,
"mk@semihalf.com" <mk@semihalf.com>,
"gtzalik@amazon.com" <gtzalik@amazon.com>,
"evgenys@amazon.com" <evgenys@amazon.com>,
"John Daley (johndale)" <johndale@cisco.com>,
"qi.z.zhang@intel.com" <qi.z.zhang@intel.com>,
"xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
"xuanziyang2@huawei.com" <xuanziyang2@huawei.com>,
"cloud.wangxiaoyun@huawei.com" <cloud.wangxiaoyun@huawei.com>,
"zhouguoyang@huawei.com" <zhouguoyang@huawei.com>,
"beilei.xing@intel.com" <beilei.xing@intel.com>,
"jingjing.wu@intel.com" <jingjing.wu@intel.com>,
"qiming.yang@intel.com" <qiming.yang@intel.com>,
"konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
"alejandro.lucero@netronome.com" <alejandro.lucero@netronome.com>,
"arybchenko@solarflare.com" <arybchenko@solarflare.com>,
"tiwei.bie@intel.com" <tiwei.bie@intel.com>,
"zhihong.wang@intel.com" <zhihong.wang@intel.com>,
"yongwang@vmware.com" <yongwang@vmware.com>,
"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH] vfio: fix interrupts race condition
Date: Sun, 14 Jul 2019 05:10:49 +0000 [thread overview]
Message-ID: <MWHPR11MB18392141DEEF5C77A761FAD0BFCC0@MWHPR11MB1839.namprd11.prod.outlook.com> (raw)
In-Reply-To: <1796500.5oFe8j95cd@xps>
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, July 11, 2019 6:21 AM
[...]
> Subject: Re: [dpdk-dev] [PATCH] vfio: fix interrupts race condition
>
> 10/07/2019 14:33, David Marchand:
> > Populating the eventfd in rte_intr_enable in each request to vfio
> > triggers a reconfiguration of the interrupt handler on the kernel side.
> > The problem is that rte_intr_enable is often used to re-enable masked
> > interrupts from drivers interrupt handlers.
> >
> > This reconfiguration leaves a window during which a device could send
> > an interrupt and then the kernel logs this (unsolicited from the kernel
> > point of view) interrupt:
> > [158764.159833] do_IRQ: 9.34 No irq handler for vector
> >
> > VFIO api makes it possible to set the fd at setup time.
> > Make use of this and then we only need to ask for masking/unmasking
> > legacy interrupts and we have nothing to do for MSI/MSIX.
> >
> > "rxtx" interrupts are left untouched but are most likely subject to the
> > same issue.
> >
> > Fixes: 5c782b3928b8 ("vfio: interrupts")
> > Cc: stable@dpdk.org
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1654824
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > Tested-by: Shahed Shaikh <shshaikh@marvell.com>
>
> This is a real bug which should be fixed in this release.
> As the patch is quite big and needs a strong validation,
> I prefer merging it quickly to give a lot of time before
> releasing 19.08-rc2.
> The maintainers of all concerned PMDs are Cc.
> Please make sure the interrupts are still working well with VFIO.
>
> Applied, thanks
>
[Apologies in advance if email format gets messed up. Forced to use
outlook for the first time..]
Hi,
This commit breaks MSI-X + rxq interrupts. I think others are seeing
the same error?
sudo ~/dpdk/examples/l3fwd-power/build/l3fwd-power \
-c 0x1e -n 4 -w 0000:1a:00.0 --log-level=pmd,debug -- -p 0x1 -P --config "(0,0,2),(0,1,3),(0,2,4)"
[...]
EAL: Error enabling MSI-X interrupts for fd 35
A rough sequence of events goes like this. The above test is using 3
rxqs (3 interrupts).
1. During probe, pci_vfio_setup_interrupts() runs.
This now does ioctl(VFIO_DEVICE_SET_IRQS) for the 1st efd
(intr_handle->fd).
ioctl does:
- pci_enable_msix(1 vector) because this is the first time enabling
interrupts.
- request_irq(vector 0)
2. App configs
The app sets port_conf.intr_conf.rxq=1, configs 3 rxqs, etc.
3. rte_eth_dev_start()
PMD calls:
- rte_intr_efd_enable()
This creates 3 efds (intr_handle->nb_efd = 3).
- rte_intr_enable() => vfio_enable_msix()
This does ioctl(VFIO_DEVICE_SET_IRQS) for the 3 efds.
ioctl now needs to request_irq() for vectors 1, 2, 3 for the 3 new
efds. It does not do another pci_enable_msix() as it has been done
earlier. Before calling request_irq(), it sees that only 1 vector was
enabled in earlier pci_enable_msix(), so it fails with EINVAL.
We would need pci_enable_msix(4 vectors) for this to work
(intr_handle->fd + 3 efds).
Prior to this patch, VFIO_DEVICE_SET_IRQS is done only in
vfio_enable_msix(). So, ioctl ends up doing pci_enable_msix(4 vectors)
and request_irq() for each of the 4 efds, which completes
successfully.
Not an expert in this area.. Perhaps, defer enabling 1st efd
(intr_handle->fd) until the first invocation of vfio_enable_msix(), so
it knows the app wants to use 4 vectors in total?
Also, vfio_disable_msix() looks a bit wrong.
irq_set.flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
irq_set.start = RTE_INTR_VEC_RXTX_OFFSET;
irq_set.count = intr_handle->nb_efd;
This tells vfio-pci to simulate interrupts by triggering efds? To
free_irq() specific efds, I think we need DATA_EVENTFD and set fd =
-1.
flags = DATA_EVENTFD | ACTION_TRIGGER
data = [fd(-1), fd(-1), ...]
I have not tested this part myself yet.
Thanks..
-Hyong
next prev parent reply other threads:[~2019-07-14 5:11 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1562071706-11009-1-git-send-email-david.marchand@redhat.com>
2019-07-10 12:33 ` [dpdk-stable] " David Marchand
2019-07-10 21:20 ` [dpdk-stable] [dpdk-dev] " Thomas Monjalon
2019-07-14 5:10 ` Hyong Youb Kim (hyonkim) [this message]
2019-07-14 11:19 ` Thomas Monjalon
2019-07-15 5:35 ` Jerin Jacob Kollanukkaran
2019-07-15 7:20 ` Hyong Youb Kim (hyonkim)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=MWHPR11MB18392141DEEF5C77A761FAD0BFCC0@MWHPR11MB1839.namprd11.prod.outlook.com \
--to=hyonkim@cisco.com \
--cc=ajit.khaparde@broadcom.com \
--cc=alejandro.lucero@netronome.com \
--cc=alex.williamson@redhat.com \
--cc=allain.legacy@windriver.com \
--cc=anatoly.burakov@intel.com \
--cc=arybchenko@solarflare.com \
--cc=beilei.xing@intel.com \
--cc=cloud.wangxiaoyun@huawei.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=evgenys@amazon.com \
--cc=gtzalik@amazon.com \
--cc=hemant.agrawal@nxp.com \
--cc=igor.russkikh@aquantia.com \
--cc=jingjing.wu@intel.com \
--cc=johndale@cisco.com \
--cc=konstantin.ananyev@intel.com \
--cc=matt.peters@windriver.com \
--cc=maxime.coquelin@redhat.com \
--cc=mk@semihalf.com \
--cc=mw@semihalf.com \
--cc=pavel.belous@aquantia.com \
--cc=qi.z.zhang@intel.com \
--cc=qiming.yang@intel.com \
--cc=ravi1.kumar@amd.com \
--cc=rmody@marvell.com \
--cc=shreyansh.jain@nxp.com \
--cc=shshaikh@marvell.com \
--cc=somnath.kotur@broadcom.com \
--cc=stable@dpdk.org \
--cc=stephen@networkplumber.org \
--cc=thomas@monjalon.net \
--cc=tiwei.bie@intel.com \
--cc=wenzhuo.lu@intel.com \
--cc=xiao.w.wang@intel.com \
--cc=xuanziyang2@huawei.com \
--cc=yongwang@vmware.com \
--cc=zhihong.wang@intel.com \
--cc=zhouguoyang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).