From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 833DBA0350; Thu, 25 Jun 2020 06:00:24 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D56D44F9C; Thu, 25 Jun 2020 06:00:23 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id BFD032B96; Thu, 25 Jun 2020 06:00:22 +0200 (CEST) IronPort-SDR: G6ez4jX87WZDd9C9vUngmDIe975wCdJMkIYxYNOWATnLwvqPgUROm11Mbqf2Wdui+gEM5ZLp3Q +U7POB+lKjzQ== X-IronPort-AV: E=McAfee;i="6000,8403,9662"; a="133171129" X-IronPort-AV: E=Sophos;i="5.75,277,1589266800"; d="scan'208";a="133171129" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Jun 2020 21:00:21 -0700 IronPort-SDR: s6dYXFntP5bU1XQh4+4gD8djQgmVt6biDXznNnxv84YTb9ijF+v2n1Rm83mN02/Ze8CrjIhUxm DOccJ1bzKuWQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,277,1589266800"; d="scan'208";a="423595508" Received: from npg-dpdk-haiyue-1.sh.intel.com ([10.67.119.213]) by orsmga004.jf.intel.com with ESMTP; 24 Jun 2020 21:00:19 -0700 From: Haiyue Wang To: dev@dpdk.org, anatoly.burakov@intel.com Cc: Haiyue Wang , stable@dpdk.org, Harman Kalra , David Marchand Date: Thu, 25 Jun 2020 11:50:46 +0800 Message-Id: <20200625035046.19820-1-haiyue.wang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200621174035.6858-1-haiyue.wang@intel.com> References: <20200621174035.6858-1-haiyue.wang@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: [dpdk-dev] [PATCH v4] bus/pci: fix VF bus error for memory access X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" To fix CVE-2020-12888, the linux vfio-pci module will invalidate mmaps and block MMIO access on disabled memory, it will send a SIGBUS to the application: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abafbc551fddede3e0a08dee1dcde08fc0eb8476 When the application opens the vfio PCI device, the vfio-pci module will enable the bus memory space through PCI read/write access. According to the PCIe specification, the 'Memory Space Enable' is always zero for VF: Table 9-13 Command Register Changes Bit Location | PF and VF Register Differences | PF | VF | From Base | Attributes | Attributes -------------+--------------------------------+------------+----------- | Memory Space Enable - Does not | | | apply to VFs. Must be hardwired| Base | 0b 1 | to 0b for VFs. VF Memory Space | | | is controlled by the VF MSE bit| | | in the VF Control register. | | -------------+--------------------------------+------------+----------- Afterwards the vfio-pci will initialize its own virtual PCI config space data ('vconfig') by reading the VF's physical PCI config space, then the 'Memory Space Enable' bit in vconfig will always be 0b value. This will make the vfio-pci treat the BAR memory space as disabled, and the SIGBUS will be triggered if access these BARs. By investigation, the VF PCI device *passthrough* into the Guest OS by QEMU has the 'Memory Space Enable' with 1b value. That's because every PCI driver will start to enable the memory space, and this action will be hooked by vfio-pci virtual PCI read/write to set the 'Memory Space Enable' in vconfig space to 1b. So VF runs in guest OS has 'Mem+', but VF runs in host OS has 'Mem-'. Align with PCI working mode in Guest/QEMU/Host, in DPDK, enable the PCI bus memory space explicitly to avoid access on disabled memory. Fixes: 33604c31354a ("vfio: refactor PCI BAR mapping") Cc: stable@dpdk.org Signed-off-by: Haiyue Wang Acked-by: Anatoly Burakov Tested-by: Harman Kalra Tested-by: David Marchand --- v4: Fix commit message typo. v3: Update the commit log, and fix one debug log with redundant description. v2: Rewrite the commit log, and put the link into it even it is long. --- drivers/bus/pci/linux/pci_vfio.c | 37 ++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c index 64cd84a68..ba60e7ce9 100644 --- a/drivers/bus/pci/linux/pci_vfio.c +++ b/drivers/bus/pci/linux/pci_vfio.c @@ -149,6 +149,38 @@ pci_vfio_get_msix_bar(int fd, struct pci_msix_table *msix_table) return 0; } +/* enable PCI bus memory space */ +static int +pci_vfio_enable_bus_memory(int dev_fd) +{ + uint16_t cmd; + int ret; + + ret = pread64(dev_fd, &cmd, sizeof(cmd), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_COMMAND); + + if (ret != sizeof(cmd)) { + RTE_LOG(ERR, EAL, "Cannot read command from PCI config space!\n"); + return -1; + } + + if (cmd & PCI_COMMAND_MEMORY) + return 0; + + cmd |= PCI_COMMAND_MEMORY; + ret = pwrite64(dev_fd, &cmd, sizeof(cmd), + VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + + PCI_COMMAND); + + if (ret != sizeof(cmd)) { + RTE_LOG(ERR, EAL, "Cannot write command to PCI config space!\n"); + return -1; + } + + return 0; +} + /* set PCI bus mastering */ static int pci_vfio_set_bus_master(int dev_fd, bool op) @@ -427,6 +459,11 @@ pci_rte_vfio_setup_device(struct rte_pci_device *dev, int vfio_dev_fd) return -1; } + if (pci_vfio_enable_bus_memory(vfio_dev_fd)) { + RTE_LOG(ERR, EAL, "Cannot enable bus memory!\n"); + return -1; + } + /* set bus mastering for the device */ if (pci_vfio_set_bus_master(vfio_dev_fd, true)) { RTE_LOG(ERR, EAL, "Cannot set up bus mastering!\n"); -- 2.27.0