Date: Wed, 30 Sep 2015 18:21:19 +0300
From: "Michael S. Tsirkin"
To: Avi Kivity
Message-ID: <20150930175848-mutt-send-email-mst@redhat.com>
References: <20150930134533-mutt-send-email-mst@redhat.com>
 <560BC6C9.4020505@cloudius-systems.com>
 <20150930143927-mutt-send-email-mst@redhat.com>
 <560BCD2F.5060505@cloudius-systems.com>
 <20150930150115-mutt-send-email-mst@redhat.com>
 <560BD284.7040505@cloudius-systems.com>
 <20150930151632-mutt-send-email-mst@redhat.com>
 <560BDE24.8000308@scylladb.com>
 <20150930165359-mutt-send-email-mst@redhat.com>
 <560BF782.4070308@scylladb.com>
In-Reply-To: <560BF782.4070308@scylladb.com>
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] Having troubles binding an SR-IOV VF to uio_pci_generic on Amazon instance
List-Id: patches and discussions about DPDK

On Wed, Sep 30, 2015 at 05:53:54PM +0300, Avi Kivity wrote:
> On 09/30/2015 05:39 PM, Michael S. Tsirkin wrote:
> >On Wed, Sep 30, 2015 at 04:05:40PM +0300, Avi Kivity wrote:
> >>
> >>On 09/30/2015 03:27 PM, Michael S.
> >>Tsirkin wrote:
> >>>On Wed, Sep 30, 2015 at 03:16:04PM +0300, Vlad Zolotarov wrote:
> >>>>On 09/30/15 15:03, Michael S. Tsirkin wrote:
> >>>>>On Wed, Sep 30, 2015 at 02:53:19PM +0300, Vlad Zolotarov wrote:
> >>>>>>On 09/30/15 14:41, Michael S. Tsirkin wrote:
> >>>>>>>On Wed, Sep 30, 2015 at 02:26:01PM +0300, Vlad Zolotarov wrote:
> >>>>>>>>The whole idea is to bypass kernel. Especially for networking...
> >>>>>>>... on dumb hardware that doesn't support doing that securely.
> >>>>>>On a very capable HW that supports whatever security requirements needed
> >>>>>>(e.g. 82599 Intel's SR-IOV VF devices).
> >>>>>Network card type is irrelevant as long as you do not have an IOMMU,
> >>>>>otherwise you would just use e.g. VFIO.
> >>>>Sorry, but I don't follow your logic here - Amazon EC2 environment is an
> >>>>example where there *is* iommu but it's not virtualized and thus VFIO is
> >>>>useless and there is an option to use directly assigned SR-IOV networking
> >>>>device there where using the kernel drivers imposes a performance impact
> >>>>compared to the UIO-based user space kernel bypass mode of usage. How
> >>>>is it irrelevant? Could u, pls, clarify your point?
> >>>>
> >>>So it's not even dumb hardware, it's another piece of software
> >>>that forces an "all or nothing" approach where either
> >>>device has access to all VM memory, or none.
> >>>And this, unfortunately, leaves you with no secure way to
> >>>allow userspace drivers.
> >>Some setups don't need security (they are single-user, single application).
> >>But do need a lot of performance (like 5X-10X performance). An example is
> >>OpenVSwitch, security doesn't help it at all and if you force it to use the
> >>kernel drivers you cripple it.
> >We'd have to see there are actual users that need this. So far, dpdk
> >seems like the only one,
>
> dpdk is a whole class of users. It's not a specific application.
>
> > and it wants to use UIO for slow path stuff
> >like polling link status.
> >Why this needs kernel bypass support, I don't
> >know. I asked, and got no answer.
>
> First, it's more than link status. dpdk also has an interrupt mode, which
> applications can fall back to when the load is light in order to save
> power (and in order not to get support calls about 100% cpu when idle).

Aha, looks like it appeared in June. Interesting, thanks for the info.

> Even for link status, you don't want to poll for that, because accessing
> device registers is expensive. An interrupt is the best approach for rare
> events like link changed.

Yea, but you probably can get by with a timer for that, even if it's ugly.

> >>Also, I'm root. I can do anything I like, including loading a patched
> >>uio_pci_generic. You're not providing _any_ security, you're simply making
> >>life harder for users.
> >Maybe that's true on your system. But I guess you know that's not true
> >for everyone, not in 2015.
>
> Why is it not true? If I'm root, I can do anything I like to my
> system, and everyone is root in 2015. I can access the BARs directly
> and program DMA, how am I more secure by uio not allowing me to set up
> msix?

That's not the point. The point always was that using uio for these
devices (capable of DMA, in particular of msix) isn't possible in a
secure way. And yes, if the same device happens to also do interrupts, UIO
does not reject it as it probably should, and we can't change this
without breaking some working setups. But this doesn't mean we should
add more setups like this that we'll then be forced to maintain.

> Non-root users are already secured by their inability to load the module,
> and by the device permissions.
>
> >
> >>>So it makes even less sense to add insecure work-arounds in the kernel.
> >>>It seems quite likely that by the time the new kernel reaches
> >>>production X years from now, EC2 will have a virtual iommu.
> >>I can adopt a new kernel tomorrow. I have no influence on EC2.
> >>
> >>
> >Xen grant tables sound like they could be the right interface
> >for EC2. google search for "grant tables iommu" immediately gives me:
> >http://lists.xenproject.org/archives/html/xen-devel/2014-04/msg00963.html
> >Maybe latest Xen is already doing the right thing, and it's just the
> >question of making VFIO use that.
> >
>
> grant tables only work for virtual devices, not physical devices.

Why not? That's what the patches above seem to do.

-- 
MST