From: Thomas Monjalon
To: dev@dpdk.org
Cc: Jerin Jacob Kollanukkaran, Vamsi Krishna Attunuru,
 arybchenko@solarflare.com, ferruh.yigit@intel.com,
 maxime.coquelin@redhat.com, Stephen Hemminger,
 bruce.richardson@intel.com, Alex Williamson,
 david.marchand@redhat.com, bluca@debian.org, Christian Ehrhardt,
 ktraynor@redhat.com, anatoly.burakov@intel.com,
 konstantin.ananyev@intel.com, honnappa.nagarahalli@arm.com,
 Liang-Min Wang, Alexander Duyck, Peter Xu, Eric Auger
Date: Thu, 31 Oct 2019 18:03:53 +0100
Message-ID: <1659615.GCIDYkGxRJ@xps>
References: <20190906091230.13923-1-vattunuru@marvell.com>
 <1612178.XsdEgM4R2a@xps>
Subject: Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
List-Id: DPDK patches and discussions

We don't get enough attention on this topic.
Let me rephrase the issue and the proposals with more people Cc'ed.

We are talking about SR-IOV VFs in VMs with a PF managed on the host by DPDK.
The PF driver is either (1) bifurcated (Mellanox case),
or (2) bound to UIO with igb_uio, or (3) bound to VFIO.

In case 1, the PF is still managed by a kernel driver, so there is no issue.

In case 2, the PF is managed by UIO.
There is no SR-IOV support in upstream UIO,
but the out-of-tree module igb_uio works.
However, we would like to drop this legacy module from DPDK.
Some (most) Linux distributions do not package igb_uio anyway.
The other issue is that igb_uio uses physical addressing,
which is not acceptable with OCTEON TX2 for performance reasons.

In case 3, the PF is managed by VFIO. This is the case we want to fix.
VFIO does not allow creating VFs.
The workaround is to create the VFs before binding the PF to VFIO.
But since Linux 4.19, VFIO forbids any SR-IOV VF management.
There is a security concern about allowing userspace to manage
SR-IOV VF messages and to take responsibility for VFs in the guest.
It is desired to let the system admin decide the security level,
by adding a flag in VFIO: "let me manage VFs, I know what I am doing".
Reference of the "recent" discussion:
	https://lkml.org/lkml/2018/3/6/855
For now, no upstream solution has been merged.

This patch proposes a solution using an out-of-tree module.
In this case, the admin explicitly decides to bind the PF to vfio_pf.
Unfortunately, this solution won't work in environments
which forbid any out-of-tree module.
Another concern is that it looks like a DPDK-only solution.

We have an issue, but we do not want to propose a half-solution
which would harm other projects and users.
So the question is:
	Do we accept this patch as a temporary solution?
	Or can we get an agreement soon on an upstream kernel solution?

Thanks for reading and giving your (clear) opinion.

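For readers unfamiliar with the case 3 workaround ("create the VFs before
binding the PF to VFIO"), the ordering can be sketched as a dry-run plan of
sysfs writes. This is only an illustration, not part of the patch: the helper
names and the example PCI address are hypothetical, and the sysfs attributes
used (sriov_numvfs, driver/unbind, driver_override, drivers_probe) are the
generic Linux PCI ones. It assumes a PF kernel driver that leaves SR-IOV
enabled when it is unbound, and on Linux 4.19 and later the final probe is
expected to be refused, as described above.

```python
from pathlib import Path
from typing import List, Tuple

SYSFS_PCI = Path("/sys/bus/pci")


def vf_before_vfio_plan(pf_bdf: str, num_vfs: int) -> List[Tuple[str, str]]:
    """Return the ordered (sysfs path, value) writes of the workaround.

    Hypothetical helper: it only builds the plan and touches nothing.
    """
    if num_vfs < 1:
        raise ValueError("num_vfs must be at least 1")
    dev = SYSFS_PCI / "devices" / pf_bdf
    return [
        # 1. Create the VFs while the PF is still bound to its kernel driver.
        (str(dev / "sriov_numvfs"), str(num_vfs)),
        # 2. Detach the PF from its kernel driver (assumes that driver
        #    leaves SR-IOV enabled on unbind).
        (str(dev / "driver" / "unbind"), pf_bdf),
        # 3. Restrict driver matching for this device to vfio-pci.
        (str(dev / "driver_override"), "vfio-pci"),
        # 4. Re-probe the PF so that vfio-pci can claim it.
        (str(SYSFS_PCI / "drivers_probe"), pf_bdf),
    ]


def apply_plan(plan: List[Tuple[str, str]]) -> None:
    """Perform the writes; needs root and real hardware."""
    for path, value in plan:
        Path(path).write_text(value)


if __name__ == "__main__":
    # Dry run: print the equivalent shell commands instead of writing.
    for path, value in vf_before_vfio_plan("0000:01:00.0", 2):
        print(f"echo {value} > {path}")
```

Keeping the plan separate from apply_plan() makes it possible to review the
exact writes before running anything as root.
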
06/09/2019 15:27, Jerin Jacob Kollanukkaran:
> From: Thomas Monjalon
> > 06/09/2019 11:12, vattunuru@marvell.com:
> > > From: Vamsi Attunuru
> > >
> > > The DPDK use case such as VF representer or OVS offload etc would call
> > > for PF and VF PCIe devices to bind vfio-pci module to enable IOMMU
> > > protection.
> > >
> > > In addition to vSwitch use case, unlike, other PCI class of devices,
> > > Network class of PCIe devices would have additional responsibility on
> > > the PF devices such as promiscuous mode support etc.
> > >
> > > The above use cases demand VFIO needs bound to PF and its VF devices.
> > > This is use case is not supported in Linux kernel, due to a security
> > > issue where it is possible to have DoS in case if VF attached to guest
> > > over vfio-pci and netdev kernel driver runs on it and which something
> > > VF representer would like to enable it.
> > >
> > > Since we can not differentiate, the vfio-pci bounded VF devices runs
> > > DPDK application or netdev driver in guest, we can not introduce any
> > > scheme to fix DoS case and therefore not have proper support of this
> > > in the upstream kernel.
> > >
> > > The igb_uio enables such PF and VF binding support for non-iommu
> > > devices to make VF representer or OVS offload run on non-iommu devices
> > > with DoS vulnerability for netdev driver as VF.
> > >
> > > This kernel module, facilitate to enable SRIOV on PF devices,
> > > therefore, to run both PF and VF devices in VFIO mode knowing its
> > > impacts like igb_uio driver functions of non-iommu devices.
> > >
> > > Signed-off-by: Vamsi Attunuru
> > > Signed-off-by: Jerin Jacob
> >
> > Sorry I fail to properly understand the explanation above.
> > Please try to split in shorter sentences.
> >
> > About the request to add an out-of-tree Linux kernel driver, I guess
> > Jerin is well aware that we don't want such anymore.
>
> Yes. I am aware of it. I don't like the out of tree modules either.
> But in this case, I suggested that Vamsi use an out-of-tree module.
>
> Let me describe the issue and let us discuss how to tackle the problem:
>
> # The Linux kernel won't allow a VFIO PF to have SR-IOV enabled.
>
> Patches and ongoing discussion are here:
> https://patchwork.kernel.org/patch/10522381/
> https://lwn.net/Articles/748526/
>
> Based on my understanding, the reason for NOT allowing the
> VFIO PF to have SR-IOV enabled is genuine from the kernel point of
> view but not from the DPDK point of view.
>
> Here is the sequence to describe the problem:
> 1) Consider that the Linux kernel allowed enabling SR-IOV on VFIO PCI.
> 2) The PF is bound to vfio-pci.
> 3) Using the SR-IOV infrastructure of the vfio-pci PF driver,
>    VFs are created.
> 4) A DPDK application is bound to both PF and VF: no issue here.
> 5) Assume a DPDK application is bound to the PF and the VF is bound
>    to a netdev kernel driver. Now there is a genuine concern
>    from the kernel point of view that the DPDK PF can intercept
>    VF mailbox messages or so and deny the kernel request.
>    Or what if the DPDK PF application crashes?
>
> To avoid case (5), step (3) is not allowed in the stock kernel,
> which makes sense IMO.
>
> Now, from the DPDK PoV, step 5 is valid, as we have
> rte_flow's VF action etc. to enable such a case:
> the user can program the PF's rte_flow to steer
> some traffic to a VF, where the VF can be a DPDK application or
> a Linux kernel netdev driver.
>
> This patch enables step (3) in order to enable step (5) from the DPDK
> PoV, i.e. DPDK needs to allow the PF to bind to DPDK with VFs.
>
> Why this issue now:
> - The igb_uio kernel driver is used for enabling step (3).
>   See store_max_vfs() in kernel/linux/igb_uio/igb_uio.c.
>   This is fine for non-IOMMU devices; IOMMU devices
>   need VFIO.
> - We would like to support VFIO for IOMMU protection
>   and enable step (5), as DPDK supports it at the spec level,
>   i.e. we need to fix the feature disparity between IOMMU and
>   non-IOMMU based devices.
>
> Note:
> We may not need a brand new kernel module; we could move
> this logic to igb_uio if maintenance is a concern.
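
For context on the igb_uio path quoted above: store_max_vfs() is reached from
userspace by writing a VF count to the max_vfs sysfs attribute of a PF bound
to igb_uio. A minimal sketch of that write follows. The helper names and the
example PCI address are illustrative assumptions, the PF is assumed to be
bound to igb_uio already, and the default is a dry run that only returns the
equivalent shell command.

```python
import re
from pathlib import Path

# PCI address in domain:bus:device.function form, e.g. 0000:01:00.0
BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def igb_uio_max_vfs_path(pf_bdf: str) -> Path:
    """Return the max_vfs attribute path for a PF bound to igb_uio."""
    if not BDF_RE.match(pf_bdf):
        raise ValueError(f"not a PCI address: {pf_bdf!r}")
    return Path("/sys/bus/pci/devices") / pf_bdf / "max_vfs"


def set_max_vfs(pf_bdf: str, num_vfs: int, dry_run: bool = True) -> str:
    """Request num_vfs VFs on the PF; return the equivalent shell command.

    Hypothetical helper: with dry_run=True (the default) nothing is written.
    """
    if num_vfs < 0:
        raise ValueError("num_vfs must not be negative")
    path = igb_uio_max_vfs_path(pf_bdf)
    if not dry_run:
        # Needs root and a PF that is actually bound to igb_uio.
        path.write_text(str(num_vfs))
    return f"echo {num_vfs} > {path}"


if __name__ == "__main__":
    print(set_max_vfs("0000:01:00.0", 2))
```

The proposed vfio_pf module would need an equivalent entry point for step (3);
its actual interface is defined by the patch and is not assumed here.
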