From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <avi@cloudius-systems.com>
Received: from mail-wi0-f170.google.com (mail-wi0-f170.google.com
 [209.85.212.170]) by dpdk.org (Postfix) with ESMTP id 1EBA95A86
 for <dev@dpdk.org>; Thu,  1 Oct 2015 12:23:56 +0200 (CEST)
Received: by wicfx3 with SMTP id fx3so21371250wic.0
 for <dev@dpdk.org>; Thu, 01 Oct 2015 03:23:56 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-type
 :content-transfer-encoding;
 bh=gs+bsTOwG1J9OpwfsHMGmYIkcly1RWPCkO/cjLCj03I=;
 b=VAfa8PyMRqstXbiujDvTiZuE2laueMOD3KQ3IWALNLRL+mX1QCr6x1c/VcxYyRsEbO
 h5+JWl3b76n10kGToO/4pkcFvyqPHsbEGkMOOURfbb5sszvigVUpVR7JLOsIi6bAsmEW
 jMSzIRGTU2ssL16BL7VukrhEya04vLTLAzE4eHiJ1i1C1xL54OauV9br+DsqmoS3jOAE
 N5xAKlaxKQl40i9bTFq66/F8MehprkzaIMjpHz9/D3+ET9RvlLaQq+f2H27gTiVMwJ45
 CrYQUJJjbiUN2Auhpu/P7Snm8rD2jVna0o2W3hrr5QpbZBT3hdnLShj9dd3CHeQCJLGa
 v77w==
X-Gm-Message-State: ALoCoQnrppdTlAy/nzsn7N0zkEceF1cKndsxM+jBe1yJyUxJTdUhN7llM2AMeuq6RR4FMNbT1K4p
X-Received: by 10.180.91.194 with SMTP id cg2mr2640830wib.72.1443695035965;
 Thu, 01 Oct 2015 03:23:55 -0700 (PDT)
Received: from avi.cloudius ([37.142.229.250])
 by smtp.googlemail.com with ESMTPSA id lf10sm5376010wjb.23.2015.10.01.03.23.53
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 01 Oct 2015 03:23:53 -0700 (PDT)
To: "Michael S. Tsirkin" <mst@redhat.com>
References: <560BF782.4070308@scylladb.com>
 <20150930175848-mutt-send-email-mst@redhat.com>
 <560C0171.7080507@scylladb.com> <20150930204016.GA29975@redhat.com>
 <20151001113828-mutt-send-email-mst@redhat.com> <560CF44A.60102@scylladb.com>
 <20151001120027-mutt-send-email-mst@redhat.com>
 <560CFB66.5050904@scylladb.com> <560CFFFF.4000601@6wind.com>
 <560D0059.5050003@scylladb.com>
 <20151001130844-mutt-send-email-mst@redhat.com>
From: Avi Kivity <avi@scylladb.com>
Message-ID: <560D09B8.2050407@scylladb.com>
Date: Thu, 1 Oct 2015 13:23:52 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <20151001130844-mutt-send-email-mst@redhat.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Having troubles binding an SR-IOV VF to
 uio_pci_generic on Amazon instance
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 01 Oct 2015 10:23:56 -0000



On 10/01/2015 01:14 PM, Michael S. Tsirkin wrote:
> On Thu, Oct 01, 2015 at 12:43:53PM +0300, Avi Kivity wrote:
>>> There were some attempts to get it for other (older) drivers, named
>>> 'bifurcated drivers', but the effort has stalled.
>> IIRC they still exposed the ring to userspace.
> How much would a ring write syscall cost? 1-2 microseconds, isn't it?
> Measurable, but it's not the end of the world.

Plus a page table walk per packet fragment to translate the address
(DPDK has the physical address prepared in the mbuf, IIRC).  The 10Mpps+
users of DPDK should comment on whether the performance hit is
acceptable (my use case is much more modest).
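
To put rough numbers on that, here is a back-of-the-envelope sketch.
Only the 10Mpps rate and the 1-2 microsecond syscall estimate come from
this thread; the batch size of 32 and the function names are assumptions
made up for illustration, and the page table walk is not modelled at all.

```python
# Back-of-the-envelope: per-packet time budget vs. amortized syscall cost.
# From the thread: 10 Mpps rate, 1-2 us per ring-write syscall.
# Hypothetical: the batch sizes and the helper names below.

def per_packet_budget_ns(rate_pps: float) -> float:
    """Time available per packet, in nanoseconds, at a given packet rate."""
    return 1e9 / rate_pps


def amortized_syscall_ns(syscall_us: float, batch: int) -> float:
    """Syscall cost per packet, in ns, when one syscall posts `batch` packets."""
    return syscall_us * 1000.0 / batch


if __name__ == "__main__":
    budget = per_packet_budget_ns(10e6)
    for syscall_us in (1.0, 2.0):
        for batch in (1, 32):
            cost = amortized_syscall_ns(syscall_us, batch)
            print(f"syscall={syscall_us} us, batch={batch}: "
                  f"{cost:.2f} ns/packet, "
                  f"{cost / budget:.2f}x of a {budget:.0f} ns budget")
```

Under these assumptions, an unbatched ring write costs 10-20 times the
100 ns per-packet budget at 10Mpps; batching 32 packets per syscall
brings it down to roughly a third to two thirds of the budget, before
counting the per-fragment translation.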

> ring read might be safe to allow.
> Should buy us enough time until hypervisors support IOMMU.

All the relevant drivers need to be converted to support ring
translation and to expose the ring to userspace through the new API.  It
shouldn't take more than 3-4 years.

Meanwhile, users of virtualized systems that need interrupt support 
cannot use their machines, while non-virtualized users are free to DMA 
wherever they like, in the name of security.

btw, an API like the one you describe already exists: vhost.  Of course
the virtio protocol is nowhere near fast enough, but at least it's an
example.