DPDK patches and discussions
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: dev@dpdk.org, hjk@hansjkoch.de, gregkh@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
Date: Thu, 1 Oct 2015 18:22:51 +0300
Message-ID: <20151001152251.GA25009@redhat.com>
In-Reply-To: <20151001075037.61c43f63@urahara>

On Thu, Oct 01, 2015 at 07:50:37AM -0700, Stephen Hemminger wrote:
> On Thu, 1 Oct 2015 11:33:06 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Sep 30, 2015 at 03:28:58PM -0700, Stephen Hemminger wrote:
> > > This driver allows using a PCI device with Message Signalled Interrupts
> > > from userspace. The API is similar to the igb_uio driver used by the DPDK.
> > > Via ioctl it provides a mechanism to map MSI-X interrupts into event
> > > file descriptors similar to VFIO.
> > >
> > > VFIO is a better choice if IOMMU is available, but often userspace drivers
> > > have to work in environments where IOMMU support (real or emulated) is
> > > not available.  No UIO driver that supports DMA is secure against
> > > rogue userspace applications programming DMA hardware to access
> > > private memory; this driver is no less secure than existing code.
> > > 
> > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > 
> > I don't think copying the igb_uio interface is a good idea.
> > What DPDK is doing with igb_uio (and indeed uio_pci_generic)
> > is abusing the sysfs BAR access to provide unlimited
> > access to hardware.
> > 
> > MSI messages are memory writes so any generic device capable
> > of MSI is capable of corrupting kernel memory.
> > This means that a bug in userspace will lead to kernel memory corruption
> > and crashes.  This is something distributions can't support.
> > 
> > uio_pci_generic is already abused like that, mostly
> > because when I wrote it, I didn't add enough protections
> > against using it with DMA capable devices,
> > and we can't go back and break working userspace.
> > > But at least it does not bind to VFs, all of which
> > > are capable of DMA.
> > 
> > The result of merging this driver will be userspace abusing the
> > sysfs BAR access with VFs as well, and we do not want that.
> > 
> > 
> > Just forwarding events is not enough to make a valid driver.
> > What is missing is a way to access the device in a safe way.
> > 
> > On a more positive note:
> > 
> > What would be a reasonable interface? One that does the following
> > in kernel:
> > 
> > 1. initializes device rings (can be in pinned userspace memory,
> >    but can not be writeable by userspace), brings up interface link
> > 2. pins userspace memory (unless using e.g. hugetlbfs)
> > > 3. gets a request, makes sure it's valid and belongs to
> > >    the correct task, puts it in the ring
> > 4. in the reverse direction, notify userspace when buffers
> >    are available in the ring
> > 5. notify userspace about MSI (what this driver does)
> > 
> > What userspace can be allowed to do:
> > 
> > 	format requests (e.g. transmit, receive) in userspace
> > 	read ring contents
> > 
> > What userspace can't be allowed to do:
> > 
> > 	access BAR
> > 	write rings
> > 
> > 
> > This means that the driver can not be a generic one,
> > and there will be a system call overhead when you
> > write the ring, but that's the price you have to
> > pay for ability to run on systems without an IOMMU.
> 
> I think I understand what you are proposing, but it really doesn't
> fit into the high speed userspace networking model.

I'm aware that the model currently does everything, including
bringing up the link, in userspace.
But there's really no justification for this.
Only data path things should be in userspace.

A userspace bug should not be able to do things like overwriting the
on-device EEPROM.


> 1. Device rings are device specific, can't be in a generic driver.

So that's more work, and it is not going to happen if people
can get by with insecure hacks.
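
To make step 3 of the list quoted above concrete (validate the request,
check that it belongs to the task, then put it in the ring), here is a
rough sketch. Everything in it is hypothetical: the descriptor layout,
the pinned_region bookkeeping and the names submit_tx/region_contains are
invented for illustration, and a real descriptor format would be device
specific. It is plain C so that it can be compiled and tried outside the
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical ring depth */

/* Hypothetical record of the buffer region one task has pinned. */
struct pinned_region {
    int       owner; /* id of the task that pinned the region */
    uintptr_t base;  /* start address of the region */
    size_t    len;   /* size of the region in bytes */
};

/* Hypothetical descriptor; real layouts are device specific. */
struct tx_desc {
    uint64_t addr;
    uint32_t len;
};

struct tx_ring {
    struct tx_desc desc[RING_SIZE];
    uint32_t head; /* next slot the driver fills */
    uint32_t tail; /* next slot the device consumes */
};

/* True if [addr, addr + len) lies entirely inside the region. */
static bool region_contains(const struct pinned_region *r,
                            uintptr_t addr, size_t len)
{
    if (len == 0 || addr < r->base)
        return false;
    size_t off = (size_t)(addr - r->base);
    /* Written this way so that off + len cannot overflow. */
    return off < r->len && len <= r->len - off;
}

/* Returns 0 if the request was queued, -1 if it was rejected. */
static int submit_tx(struct tx_ring *ring, const struct pinned_region *r,
                     int task, uintptr_t addr, size_t len)
{
    assert(ring->head - ring->tail <= RING_SIZE);

    if (task != r->owner)
        return -1; /* request does not belong to this task */
    if (len > UINT32_MAX || !region_contains(r, addr, len))
        return -1; /* buffer is outside the pinned memory */
    if (ring->head - ring->tail == RING_SIZE)
        return -1; /* ring is full */

    /* Only validated values ever reach the descriptor. */
    struct tx_desc *d = &ring->desc[ring->head % RING_SIZE];
    d->addr = addr;
    d->len = (uint32_t)len;
    ring->head++;
    return 0;
}
```

The point of the sketch is only that the device never sees an address
the driver has not checked; the cost is one call into the driver per
request.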

> 2. DPDK uses huge memory.

Hugetlbfs? Don't see why this is an issue. Might make things simpler.

> 3. Performance requires all ring requests be done in pure userspace,
>    (ie no syscalls)

Make only the TX ring writeable then. At least you won't be
able to corrupt the kernel memory.
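
A minimal sketch of the split in access rights, using only standard
POSIX calls. The temporary file merely stands in for ring memory that a
driver would export; ring_backing_create and ring_map are hypothetical
helper names, and a real driver would have to enforce the protection in
its own mmap handler rather than trust the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create an unnamed backing file of `size` bytes and return its fd,
 * or -1 on failure. Stand-in for ring memory exported by a driver.
 */
static int ring_backing_create(size_t size)
{
    FILE *f = tmpfile();
    if (!f)
        return -1;
    int fd = dup(fileno(f)); /* keep the file alive after fclose() */
    fclose(f);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Map the ring shared. writable != 0 gives a read-write view (TX ring),
 * writable == 0 gives a read-only view (RX ring).
 * Returns NULL on failure.
 */
static void *ring_map(int fd, size_t size, int writable)
{
    assert(size > 0);
    int prot = PROT_READ | (writable ? PROT_WRITE : 0);
    void *p = mmap(NULL, size, prot, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

With a read-only view, loads from the ring work without any system
call, while a store through that view faults instead of reaching
memory the device will act on.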

> 4. Ditto, can't have kernel to userspace notification per packet

RX ring can be read-only, so userspace can read it directly.
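
Roughly, polling such a ring from userspace could look like the sketch
below. The layout (rx_desc, rx_ring, a free-running producer index) is
an assumption made for illustration and not any device's real format;
rx_produce stands in for whatever the kernel or device does on receive,
and overrun handling is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RX_RING_SIZE 4u /* hypothetical ring depth */

/* Hypothetical completed-receive descriptor. */
struct rx_desc {
    uint32_t buf_index; /* which receive buffer holds the packet */
    uint32_t len;       /* packet length in bytes */
};

/* Shared ring: written by the producer only, read by userspace. */
struct rx_ring {
    struct rx_desc desc[RX_RING_SIZE];
    _Atomic uint32_t prod; /* free-running count of filled slots */
};

/* Consumer position lives in private userspace memory, not in the ring. */
struct rx_cursor {
    uint32_t cons;
};

/*
 * Poll once. Returns 1 and copies a descriptor to *out if one is ready,
 * 0 otherwise. Performs loads from the ring only, so it would work
 * unchanged on a read-only mapping, with no system call.
 */
static int rx_poll(struct rx_ring *ring, struct rx_cursor *cur,
                   struct rx_desc *out)
{
    assert(out != NULL);
    uint32_t prod = atomic_load_explicit(&ring->prod, memory_order_acquire);
    if (cur->cons == prod)
        return 0; /* nothing new */
    *out = ring->desc[cur->cons % RX_RING_SIZE];
    cur->cons++;
    return 1;
}

/* Producer side, included only so the sketch can be exercised. */
static void rx_produce(struct rx_ring *ring, uint32_t buf_index, uint32_t len)
{
    uint32_t prod = atomic_load_explicit(&ring->prod, memory_order_relaxed);
    ring->desc[prod % RX_RING_SIZE] = (struct rx_desc){ buf_index, len };
    /* Release so the descriptor is visible before the new index. */
    atomic_store_explicit(&ring->prod, prod + 1, memory_order_release);
}
```

Because the consumer index is kept outside the ring, userspace never
needs write access to the shared memory in order to receive.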

-- 
MST
