From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 1 Oct 2015 10:26:19 -0700
From: Stephen Hemminger
To: "Michael S. Tsirkin"
Cc: dev@dpdk.org, hjk@hansjkoch.de, gregkh@linux-foundation.org,
 linux-kernel@vger.kernel.org
Subject: Re: [dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X
Message-ID: <20151001102619.1944fffa@urahara>
In-Reply-To: <20151001192911-mutt-send-email-mst@redhat.com>
References: <1443652138-31782-1-git-send-email-stephen@networkplumber.org>
 <1443652138-31782-3-git-send-email-stephen@networkplumber.org>
 <20151001104505-mutt-send-email-mst@redhat.com>
 <20151001192911-mutt-send-email-mst@redhat.com>
List-Id: patches and discussions about DPDK

On Thu, 1 Oct 2015 19:31:08 +0300
"Michael S. Tsirkin" wrote:

> On Thu, Oct 01, 2015 at 11:33:06AM +0300, Michael S. Tsirkin wrote:
> > On Wed, Sep 30, 2015 at 03:28:58PM -0700, Stephen Hemminger wrote:
> > > This driver allows using PCI device with Message Signalled Interrupt
> > > from userspace. The API is similar to the igb_uio driver used by the DPDK.
> > > Via ioctl it provides a mechanism to map MSI-X interrupts into event
> > > file descriptors similar to VFIO.
> > >
> > > VFIO is a better choice if IOMMU is available, but often userspace drivers
> > > have to work in environments where IOMMU support (real or emulated) is
> > > not available. All UIO drivers that support DMA are not secure against
> > > rogue userspace applications programming DMA hardware to access
> > > private memory; this driver is no less secure than existing code.
> > >
> > > Signed-off-by: Stephen Hemminger
> >
> > I don't think copying the igb_uio interface is a good idea.
> > What DPDK is doing with igb_uio (and indeed uio_pci_generic)
> > is abusing the sysfs BAR access to provide unlimited
> > access to hardware.
> >
> > MSI messages are memory writes so any generic device capable
> > of MSI is capable of corrupting kernel memory.
> > This means that a bug in userspace will lead to kernel memory corruption
> > and crashes. This is something distributions can't support.
> >
> > uio_pci_generic is already abused like that, mostly
> > because when I wrote it, I didn't add enough protections
> > against using it with DMA capable devices,
> > and we can't go back and break working userspace.
> > But at least it does not bind to VFs which all of
> > them are capable of DMA.
> >
> > The result of merging this driver will be userspace abusing the
> > sysfs BAR access with VFs as well, and we do not want that.
> >
> >
> > Just forwarding events is not enough to make a valid driver.
> > What is missing is a way to access the device in a safe way.
> >
> > On a more positive note:
> >
> > What would be a reasonable interface? One that does the following
> > in kernel:
> >
> > 1. initializes device rings (can be in pinned userspace memory,
> >    but can not be writeable by userspace), brings up interface link
> > 2. pins userspace memory (unless using e.g. hugetlbfs)
> > 3. gets request, make sure it's valid and belongs to
> >    the correct task, put it in the ring
> > 4. in the reverse direction, notify userspace when buffers
> >    are available in the ring
> > 5. notify userspace about MSI (what this driver does)
> >
> > What userspace can be allowed to do:
> >
> >     format requests (e.g. transmit, receive) in userspace
> >     read ring contents
> >
> > What userspace can't be allowed to do:
> >
> >     access BAR
> >     write rings
> >
> >
> > This means that the driver can not be a generic one,
> > and there will be a system call overhead when you
> > write the ring, but that's the price you have to
> > pay for ability to run on systems without an IOMMU.
> >
> >
> The device specific parts can be taken from John Fastabend's patches
> BTW:
>
> https://patchwork.ozlabs.org/patch/396713/
>
> IIUC what was missing there was exactly the memory protection
> we are looking for here.

The bifurcated drivers are interesting from an architecture point of
view, but they do nothing to solve the immediate use case. The problem
is not in bare-metal environments; most of those already have an IOMMU.
The issues are in environments like VMware with SR-IOV or vmxnet3;
neither of those is really helped by a bifurcated driver or VFIO.
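
For readers following the thread, step 3 of the interface proposed in the
quoted message (the kernel checks each request and is the only writer of the
ring) might look roughly like the sketch below. It is a plain userspace
simulation, not code from either patch set: struct request, struct
pinned_region, struct ring, RING_SIZE, and ring_submit() are all hypothetical
names invented for illustration, and a real driver would also have to check
task ownership and handle the device-specific descriptor format.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical ring depth; must be a power of two */

/* Hypothetical request descriptor, formatted by userspace. */
struct request {
    uint64_t addr; /* start of the buffer */
    uint32_t len;  /* length in bytes */
};

/* Hypothetical record of the memory pinned for one task. */
struct pinned_region {
    uint64_t base;
    uint64_t size;
};

/* Ring written only by the "kernel" side; userspace would map it read-only. */
struct ring {
    struct request slots[RING_SIZE];
    uint32_t head; /* next slot to fill */
    uint32_t tail; /* next slot the device would consume */
};

/* A request is valid only if it lies entirely inside the pinned region. */
static bool request_is_valid(const struct pinned_region *r,
                             const struct request *req)
{
    if (req->len == 0 || req->addr < r->base)
        return false;
    uint64_t off = req->addr - r->base;
    /* Compare against the remaining space to avoid integer overflow. */
    return off < r->size && req->len <= r->size - off;
}

/* Stand-in for the system call: validate, then place the request in the ring. */
static int ring_submit(struct ring *ring, const struct pinned_region *r,
                       const struct request *req)
{
    if (!request_is_valid(r, req))
        return -1; /* rejected: buffer is outside the caller's pinned memory */
    if (ring->head - ring->tail == RING_SIZE)
        return -2; /* ring full */
    ring->slots[ring->head % RING_SIZE] = *req;
    ring->head++;
    return 0;
}
```

The extra call per submission is the system call overhead mentioned in the
quoted message; in exchange, a buggy application can only get a request
rejected, not point the device at memory it does not own.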