From: Maxime Coquelin
To: 谢华伟(此时此刻), ferruh.yigit@intel.com
Cc: dev@dpdk.org, anatoly.burakov@intel.com, david.marchand@redhat.com, zhihong.wang@intel.com, chenbo.xia@intel.com, grive@u256.net
Date: Thu, 21 Jan 2021 16:38:14 +0100
Subject: Re: [dpdk-dev] [PATCH v5 3/3] PCI: don't use vfio ioctl call to access PIO resource
Message-ID: <3c83a06d-c757-e470-441b-a8b7f496a953@redhat.com>
In-Reply-To: <40e0702d-7847-9dc3-1904-03a7b8e92c2e@alibaba-inc.com>
References: <68ecd941-9c56-4de7-fae2-2ad15bdfd81a@alibaba-inc.com> <1603381885-88819-1-git-send-email-huawei.xhw@alibaba-inc.com> <1603381885-88819-4-git-send-email-huawei.xhw@alibaba-inc.com> <18871462-4d25-302a-2716-99ebec65c3ac@alibaba-inc.com> <40e0702d-7847-9dc3-1904-03a7b8e92c2e@alibaba-inc.com>

On 1/21/21 3:57 PM, 谢华伟(此时此刻) wrote:
>
> On 2021/1/21 16:29, Maxime Coquelin wrote:
>>
>> On 1/20/21 3:54 PM, 谢华伟(此时此刻) wrote:
>>> On 2021/1/13 0:58, Maxime Coquelin wrote:
>>>> On 1/12/21 10:37 AM, Maxime Coquelin wrote:
>>>>> bus/pci: ...
>>>>>
>>>>> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>>>>>> From: "huawei.xhw"
>>>>>>
>>>>>> VFIO should use the same way to map/read/write PORT IO as UIO, for
>>>>>> virtio PMD.
>>>>> Please provide more details in the commit message on why the way VFIO
>>>>> works today is wrong (The cover letter is lost once applied).
>>> ok
>>>>>> Signed-off-by: huawei.xhw
>>>>> Same comment about name format as on previous patches.
>>>>>
>>>>>> ---
>>>>>>    drivers/bus/pci/linux/pci.c     | 8 ++++----
>>>>>>    drivers/bus/pci/linux/pci_uio.c | 4 +++-
>>>>>>    2 files changed, 7 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/bus/pci/linux/pci.c
>>>>>> b/drivers/bus/pci/linux/pci.c
>>>>>> index 0dc99e9..2ed9f2b 100644
>>>>>> --- a/drivers/bus/pci/linux/pci.c
>>>>>> +++ b/drivers/bus/pci/linux/pci.c
>>>>>> @@ -687,7 +687,7 @@ int rte_pci_write_config(const struct
>>>>>> rte_pci_device *device,
>>>>>>    #ifdef VFIO_PRESENT
>>>>>>        case RTE_PCI_KDRV_VFIO:
>>>>>>            if (pci_vfio_is_enabled())
>>>>>> -            ret = pci_vfio_ioport_map(dev, bar, p);
>>>>>> +            ret = pci_uio_ioport_map(dev, bar, p);
>>>>> Doesn't it create a regression with regards to needed capabilities?
>>>>> My understanding is that before this patch we don't need to call
>>>>> iopl(), whereas once applied it is required, correct?
>>>> I did some testing today, and think it is not a regression with
>>>> para-virtualized Virtio devices.
>>>>
>>>> Indeed, I thought it would be a regression with Legacy devices when
>>>> IOMMU is enabled and the program is run as non-root (IOMMU enabled
>>>> just to support IOVA as VA mode). But it turns out para-virtualized
>>>> Virtio legacy device and vIOMMU enabled is not a supported
>>>> configuration by QEMU.
>>>>
>>>> Note that when noiommu mode is enabled, the app needs cap_sys_rawio, so
>>>> same as iopl(). No regression in this case too.
>>>>
>>>> That said, with real (non para-virtualized) Virtio device using PIO
>>>> like yours, doesn't your patch introduce a restriction for your device
>>>> that it will require cap_sys_rawio whereas it would not be needed?
>>> I don't catch the regression issue.
>>>
>>> With real virtio device (hardware implemented), if it is using MMIO, no
>>> cap_sys_rawio is required.
>>>
>>> If it is using PIO, iopl is required always.
>> My understanding of the Kernel VFIO driver is that cap_sys_rawio is only
>> necessary in noiommu mode, i.e. when VFIO is loaded with the
>> enable_unsafe_noiommu_mode parameter set. The doc for this parameter
>> seems to validate my understanding of the code:
>> "
>> MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU
>> mode.  This mode provides no device isolation, no DMA translation, no
>> host kernel protection, cannot be used for device assignment to virtual
>> machines, requires RAWIO permissions, and will taint the kernel.  If you
>> do not know what this is for, step away. (default: false)");
>> "
>>
>> I think that using inb/outb in the case of VFIO with IOMMU enabled won't
>> work without cap_sys_rawio, and using it in the case of VFIO with IOMMU
>> disabled just bypasses VFIO and so is not correct.
>
> Get your concern.
>
> PIO bar:
>
>     HW virtio on HW machine: any vendor implements hardware virtio using
> PIO bar? I think this isn't right. And I doubt that vfio doesn't check
> rawio permission in the syscall in this case.

I checked the VFIO code, and it only checks for rawio permission if noiommu
mode is enabled.

>     Para virtio:  you have no choice but to enable unsafe no-iommu mode.
> You must have RAWIO permission.
>
> so with PIO bar, the regression doesn't exist in real world.
>
>
> Btw, our virtio device is basically MMIO bar, either in hardware machine
> or in passed-through virtual machine.

OK, that thing was not clear to me.

>
> Do you mean we apply or abandon patch 3? I am OK with both. The first
> priority to me is to enable MMIO bar support.

OK, so yes, I think we should abandon patch 2 and patch 3.
For patch 1, it looks valid to me, but I'll let Ferruh decide.

For your device, if my understanding is correct, what we need to do is
to support MMIO for legacy devices. Correct?

If so, the change should be in virtio_pci.c. In vtpci_init(), after
modern detection has failed, we should check whether the BAR is PIO or
MMIO based on the flag.
The result can be saved in struct virtio_pci_dev.

We would introduce new wrappers like vtpci_legacy_read and
vtpci_legacy_write that would either call rte_pci_ioport_read,
rte_pci_ioport_write in case of PIO, or rte_read32, rte_write32 in case
of MMIO.

It is not too late for this release, as the change will not be that
intrusive. But if you prepare such a patch, please base it on top of my
virtio rework series. To make it easier for you, I added it to the
dpdk-next-virtio tree:
https://git.dpdk.org/next/dpdk-next-virtio/log/?h=virtio_pmd_rework_v2

Thanks,
Maxime

>
>> In my opinion, what we should do is to add something like this in the
>> DPDK documentation:
>>
>>   - MMIO BAR: VFIO with IOMMU enabled recommended. Equivalent performance
>> as with IGB UIO or VFIO with NOIOMMU. VFIO with IOMMU is recommended for
>> security reasons.
>>   - PIO BAR: VFIO with IOMMU enabled is recommended for security reasons,
>> providing proper isolation and not requiring cap_sys_rawio. However, use
>> of IOMMU is not always possible (e.g. para-virtualized Virtio-net legacy
>> device). Also, using VFIO for PIO BAR accesses has an impact on
>> performance as it uses pread/pwrite syscalls, whereas UIO drivers use
>> inb/outb. If security is not a concern or IOMMU is not available, one
>> might consider using UIO driver in this case for performance reasons.
>>
>> What do you think?
>>>> Thanks,
>>>> Maxime
>>>>
>>>>> Regards,
>>>>> Maxime
>>>>>
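
For illustration only, here is a rough sketch of the legacy wrapper idea
described above: decide once at init whether the BAR is PIO or MMIO, store
the result, then dispatch on every access. All names in it (struct
legacy_dev, legacy_dev_init, legacy_read32, legacy_write32, BAR_FLAG_IO,
fake_pio_space) are hypothetical and are not the virtio PMD code. The PIO
path is simulated with an in-memory array so the sketch runs without
hardware; a real driver would call rte_pci_ioport_read/rte_pci_ioport_write
there, and rte_read32/rte_write32 on the MMIO path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit marking an I/O port BAR. The value is assumed to
 * mirror Linux's IORESOURCE_IO as reported in the sysfs resource file. */
#define BAR_FLAG_IO 0x00000100ULL

/* Stand-in for the port I/O space so the sketch runs without hardware. */
static uint8_t fake_pio_space[64];

struct legacy_dev {
	bool bar_is_pio;             /* decided once, at init time */
	volatile uint8_t *mmio_base; /* only used when !bar_is_pio */
};

/* Record the access method based on the BAR flag. */
static struct legacy_dev legacy_dev_init(uint64_t bar_flags, void *mmio_base)
{
	struct legacy_dev dev;

	dev.bar_is_pio = (bar_flags & BAR_FLAG_IO) != 0;
	if (dev.bar_is_pio)
		dev.mmio_base = NULL;
	else
		dev.mmio_base = (volatile uint8_t *)mmio_base;
	return dev;
}

/* Read a 32-bit register through whichever path was selected at init. */
static uint32_t legacy_read32(const struct legacy_dev *dev, size_t off)
{
	uint32_t val;

	assert((off & 3) == 0);
	if (dev->bar_is_pio) {
		/* Simulated port read; real code would use the ioport API. */
		assert(off + sizeof(val) <= sizeof(fake_pio_space));
		memcpy(&val, fake_pio_space + off, sizeof(val));
		return val;
	}
	/* MMIO path: a single 32-bit volatile load. */
	return *(volatile uint32_t *)(dev->mmio_base + off);
}

/* Write a 32-bit register through whichever path was selected at init. */
static void legacy_write32(const struct legacy_dev *dev, size_t off,
			   uint32_t val)
{
	assert((off & 3) == 0);
	if (dev->bar_is_pio) {
		/* Simulated port write; real code would use the ioport API. */
		assert(off + sizeof(val) <= sizeof(fake_pio_space));
		memcpy(fake_pio_space + off, &val, sizeof(val));
		return;
	}
	/* MMIO path: a single 32-bit volatile store. */
	*(volatile uint32_t *)(dev->mmio_base + off) = val;
}
```

The point of the design is that callers never test the BAR type
themselves: the check happens once, and the rest of the driver only sees
the two wrappers.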