From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: dev@dpdk.org, hjk@hansjkoch.de, gregkh@linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
Date: Thu, 1 Oct 2015 18:22:51 +0300 [thread overview]
Message-ID: <20151001152251.GA25009@redhat.com> (raw)
In-Reply-To: <20151001075037.61c43f63@urahara>
On Thu, Oct 01, 2015 at 07:50:37AM -0700, Stephen Hemminger wrote:
> On Thu, 1 Oct 2015 11:33:06 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> > On Wed, Sep 30, 2015 at 03:28:58PM -0700, Stephen Hemminger wrote:
> > > This driver allows using PCI device with Message Signalled Interrupt
> > > from userspace. The API is similar to the igb_uio driver used by the DPDK.
> > > Via ioctl it provides a mechanism to map MSI-X interrupts into event
> > > file descriptors similar to VFIO.
> > >
> > > VFIO is a better choice if IOMMU is available, but often userspace drivers
> > > have to work in environments where IOMMU support (real or emulated) is
> > > not available. All UIO drivers that support DMA are not secure against
> > > rogue userspace applications programming DMA hardware to access
> > > private memory; this driver is no less secure than existing code.
> > >
> > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> >
> > I don't think copying the igb_uio interface is a good idea.
> > What DPDK is doing with igb_uio (and indeed uio_pci_generic)
> > is abusing the sysfs BAR access to provide unlimited
> > access to hardware.
> >
> > MSI messages are memory writes so any generic device capable
> > of MSI is capable of corrupting kernel memory.
> > This means that a bug in userspace will lead to kernel memory corruption
> > and crashes. This is something distributions can't support.
> >
> > uio_pci_generic is already abused like that, mostly
> > because when I wrote it, I didn't add enough protections
> > against using it with DMA capable devices,
> > and we can't go back and break working userspace.
> > But at least it does not bind to VFs which all of
> > them are capable of DMA.
> >
> > The result of merging this driver will be userspace abusing the
> > sysfs BAR access with VFs as well, and we do not want that.
> >
> >
> > Just forwarding events is not enough to make a valid driver.
> > What is missing is a way to access the device in a safe way.
> >
> > On a more positive note:
> >
> > What would be a reasonable interface? One that does the following
> > in kernel:
> >
> > 1. initializes device rings (can be in pinned userspace memory,
> > but can not be writeable by userspace), brings up interface link
> > 2. pins userspace memory (unless using e.g. hugetlbfs)
> > 3. gets request, make sure it's valid and belongs to
> > the correct task, put it in the ring
> > 4. in the reverse direction, notify userspace when buffers
> > are available in the ring
> > 5. notify userspace about MSI (what this driver does)
> >
> > What userspace can be allowed to do:
> >
> > format requests (e.g. transmit, receive) in userspace
> > read ring contents
> >
> > What userspace can't be allowed to do:
> >
> > access BAR
> > write rings
> >
> >
> > This means that the driver can not be a generic one,
> > and there will be a system call overhead when you
> > write the ring, but that's the price you have to
> > pay for ability to run on systems without an IOMMU.
>
> I think I understand what you are proposing, but it really doesn't
> fit into the high speed userspace networking model.
I'm aware of the fact currently the model does everything including
bringing up the link in user-space.
But there's really no justification for this.
Only data path things should be in userspace.
A userspace bug should not be able to do things like over-writing the
on-device EEPROM.
> 1. Device rings are device specific, can't be in a generic driver.
So that's more work, and it is not going to happen if people
can get by with insecure hacks.
> 2. DPDK uses huge mememory.
Hugetlbfs? Don't see why this is an issue. Might make things simpler.
> 3. Performance requires all ring requests be done in pure userspace,
> (ie no syscalls)
Make only the TX ring writeable then. At least you won't be
able to corrupt the kernel memory.
> 4. Ditto, can't have kernel to userspace notification per packet
RX ring can be read-only, so userspace can read it directly.
--
MST
next prev parent reply other threads:[~2015-10-01 15:22 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-30 22:28 [dpdk-dev] [PATCH 0/2] uio_msi: device driver Stephen Hemminger
2015-09-30 22:28 ` [dpdk-dev] [PATCH 1/2] uio: add support for ioctls Stephen Hemminger
2015-09-30 22:28 ` [dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X Stephen Hemminger
2015-10-01 8:33 ` Michael S. Tsirkin
2015-10-01 10:37 ` Michael S. Tsirkin
2015-10-01 16:06 ` Michael S. Tsirkin
2015-10-01 14:50 ` Stephen Hemminger
2015-10-01 15:22 ` Michael S. Tsirkin [this message]
2015-10-01 16:31 ` Michael S. Tsirkin
2015-10-01 17:26 ` Stephen Hemminger
2015-10-01 18:25 ` Michael S. Tsirkin
2015-10-05 21:54 ` Michael S. Tsirkin
2015-10-05 22:09 ` Vladislav Zolotarov
2015-10-05 22:49 ` Michael S. Tsirkin
2015-10-06 7:33 ` Stephen Hemminger
2015-10-06 12:15 ` Avi Kivity
2015-10-06 14:07 ` Michael S. Tsirkin
2015-10-06 15:41 ` Avi Kivity
2015-10-16 17:11 ` Thomas Monjalon
2015-10-16 17:20 ` Stephen Hemminger
2015-10-06 13:42 ` Michael S. Tsirkin
2015-10-06 8:23 ` Vlad Zolotarov
2015-10-06 13:58 ` Michael S. Tsirkin
2015-10-06 14:49 ` Vlad Zolotarov
2015-10-06 15:00 ` Michael S. Tsirkin
2015-10-06 16:40 ` Vlad Zolotarov
2015-10-01 23:40 ` Alexander Duyck
2015-10-02 0:01 ` Stephen Hemminger
2015-10-02 1:21 ` Alexander Duyck
2015-10-02 0:04 ` Stephen Hemminger
2015-10-02 2:33 ` Alexander Duyck
2015-10-01 8:36 ` [dpdk-dev] [PATCH 0/2] uio_msi: device driver Michael S. Tsirkin
2015-10-01 10:59 ` Avi Kivity
2015-10-01 14:57 ` Stephen Hemminger
2015-10-01 19:48 ` Alexander Duyck
2015-10-01 22:00 ` Stephen Hemminger
2015-10-01 23:03 ` Alexander Duyck
2015-10-01 23:39 ` Stephen Hemminger
2015-10-01 23:43 ` Alexander Duyck
2015-10-02 0:04 ` Stephen Hemminger
2015-10-02 1:39 ` Alexander Duyck
2015-10-04 16:49 ` Vlad Zolotarov
2015-10-04 19:03 ` Greg KH
2015-10-04 20:49 ` Vlad Zolotarov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151001152251.GA25009@redhat.com \
--to=mst@redhat.com \
--cc=dev@dpdk.org \
--cc=gregkh@linux-foundation.org \
--cc=hjk@hansjkoch.de \
--cc=linux-kernel@vger.kernel.org \
--cc=stephen@networkplumber.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).