From: Haiyue Wang
To: dev@dpdk.org, anatoly.burakov@intel.com, thomas@monjalon.net, vattunuru@marvell.com, jerinj@marvell.com, alex.williamson@redhat.com, david.marchand@redhat.com
Cc: Haiyue Wang
Date: Sun, 19 Apr 2020 01:30:35 +0800
Message-Id: <20200418173035.8000-3-haiyue.wang@intel.com>
In-Reply-To: <20200418173035.8000-1-haiyue.wang@intel.com>
References: <20200305043311.17065-1-vattunuru@marvell.com> <20200418173035.8000-1-haiyue.wang@intel.com>
Subject: [dpdk-dev] [PATCH v8 2/2] eal: support for VFIO-PCI VF token

Since Linux 5.7, the vfio-pci kernel module uses a VF token to enable
SR-IOV support. The VF token can be set by a vfio-pci based PF driver and
must be known by the vfio-pci based VF driver in order to gain access to
the device.

Signed-off-by: Haiyue Wang
Acked-by: Vamsi Attunuru
Tested-by: Vamsi Attunuru
---
 doc/guides/linux_gsg/linux_drivers.rst | 38 ++++++++++++-
 doc/guides/rel_notes/release_20_05.rst |  5 ++
 drivers/bus/pci/linux/pci_vfio.c       | 74 +++++++++++++++++++++++++-
 lib/librte_eal/freebsd/eal.c           |  3 +-
 lib/librte_eal/include/rte_vfio.h      | 21 +++++++-
 lib/librte_eal/linux/eal_vfio.c        | 20 +++++--
 6 files changed, 153 insertions(+), 8 deletions(-)

diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
index 238f3e900..3391723c9 100644
--- a/doc/guides/linux_gsg/linux_drivers.rst
+++ b/doc/guides/linux_gsg/linux_drivers.rst
@@ -72,11 +72,47 @@ Note that in order to use VFIO, your kernel must support it.
 VFIO kernel modules have been included in the Linux kernel since version 3.6.0 and are usually present by default,
 however please consult your distributions documentation to make sure that is the case.
 
+Since Linux version 5.7, the ``vfio-pci`` module supports the creation of virtual functions; this feature is disabled
+by default. When enabled, the PF needs a shared VF token (UUID) to set up the trust between the SR-IOV PF and VFs. The VF
+token is any valid UUID value selected by the user. When the PF device is bound to the ``vfio-pci`` module, it should
+not have any VFs created, which is consistent with the previous behavior, for security reasons.
+
+Some use cases for the VF token:
+
+  - The user uses only the PF for DPDK: no VF token is required to start the PF device.
+
+  - The user wants to create SR-IOV VFs on a PF device that is bound to the ``vfio-pci`` module: the user needs to select
+    a valid UUID as the VF token to start the PF device; after the VFs are created, this VF token is also required to access each
+    VF device.
+
+  - If the DPDK application that runs on the PF device exits, and the user wants to restart it with a different VF token
+    value, this works only if no application (DPDK or KVM) runs on a VF; otherwise, it fails to start and the kernel logs
+    "[19145.688094] vfio-pci 0000:87:00.0: Incorrect VF token provided for device". When all of the VFs are free, the
+    user can select a new VF token to start the PF device.
+
+The VFs created are bound to the ``vfio-pci`` module automatically. DPDK uses the keyword ``vf_token`` as the device argument
+to pass the VF token value to the PF and its related VFs. The PMD should not use this argument: it is pruned from the
+device argument list, so the PMD can parse its own valid device arguments without seeing it.
+
+.. code-block:: console
+
+   1. sudo modprobe vfio-pci enable_sriov=1
+
+   2. ./usertools/dpdk-devbind.py -b vfio-pci 0000:87:00.0
+
+   3. echo 2 > /sys/bus/pci/devices/0000:87:00.0/sriov_numvfs
+
+   4. Start the PF:
+      ./x86_64-native-linux-gcc/app/testpmd -l 22-25 -n 4 -w 87:00.0,vf_token=2ab74924-c335-45f4-9b16-8569e5b08258 --file-prefix=pf -- -i
+
+   5. Start the VF:
+      ./x86_64-native-linux-gcc/app/testpmd -l 26-29 -n 4 -w 87:02.0,vf_token=2ab74924-c335-45f4-9b16-8569e5b08258 --file-prefix=vf0 -- -i
+
 Also, to use VFIO, both kernel and BIOS must support and be configured to use IO virtualization (such as Intel® VT-d).
 
 .. note::
 
-    ``vfio-pci`` module doesn't support the creation of virtual functions.
+    ``vfio-pci`` module doesn't support the creation of virtual functions before Linux version 5.7.
 
 For proper operation of VFIO when running DPDK applications as a non-privileged user, correct permissions should also be set up.
 This can be done by using the DPDK setup script (called dpdk-setup.sh and located in the usertools directory).
diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst
index 184967844..0a30f912b 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -81,6 +81,11 @@ New Features
   by making use of the event device capabilities. The event mode currently supports
   only inline IPsec protocol offload.
 
+* **Added support for the new vfio-pci VF token interface.**
+
+  Since Linux version 5.7, vfio-pci supports a shared VF token (UUID) to represent
+  the trust between the SR-IOV PF and the created VFs. Updated the method of gaining
+  access to the device by appending the VF token.
 
 Removed Items
 -------------
diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 64cd84a68..efb64e2ba 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -11,6 +11,7 @@
 #include
 #include
 
+#include
 #include
 #include
 #include
@@ -644,12 +645,72 @@ pci_vfio_msix_is_mappable(int vfio_dev_fd, int msix_region)
 	return ret;
 }
 
+static int
+vfio_pci_vf_token_arg(struct rte_devargs *devargs, rte_uuid_t uuid)
+{
+#define VF_TOKEN_ARG "vf_token="
+	char c, *p, *vf_token;
+
+	memset(uuid, 0, sizeof(rte_uuid_t));
+
+	if (devargs == NULL)
+		return 0;
+
+	p = strstr(devargs->args, VF_TOKEN_ARG);
+	if (!p)
+		return 0;
+
+	vf_token = p + strlen(VF_TOKEN_ARG);
+	if (strlen(vf_token) < (RTE_UUID_STRLEN - 1)) {
+		RTE_LOG(ERR, EAL, "The VF token length is too short\n");
+		return -1;
+	}
+
+	c = vf_token[RTE_UUID_STRLEN - 1];
+	if (c != '\0' && c != ',') {
+		RTE_LOG(ERR, EAL,
+			"The VF token ends with an invalid character : %c\n", c);
+		return -1;
+	}
+
+	vf_token[RTE_UUID_STRLEN - 1] = '\0';
+	if (rte_uuid_parse(vf_token, uuid)) {
+		RTE_LOG(ERR, EAL,
+			"The VF token is invalid : %s\n", vf_token);
+		vf_token[RTE_UUID_STRLEN - 1] = c;
+		return -1;
+	}
+
+	RTE_LOG(DEBUG, EAL,
+		"The VF token is found : %s\n", vf_token);
+
+	vf_token[RTE_UUID_STRLEN - 1] = c;
+
+	/* This VF token will be treated as an invalid device argument if the
+	 * PMD calls the rte_devargs parse API with its own valid argument list,
+	 * so it needs to purge this vfio-pci specific argument.
+	 */
+	if (c != '\0') {
+		/* 1. Handle the case : 'vf_token=uuid,arg1=val1' */
+		memmove(p, vf_token + RTE_UUID_STRLEN,
+			strlen(vf_token + RTE_UUID_STRLEN) + 1);
+	} else {
+		/* 2. Handle the case : 'arg1=val1,vf_token=uuid' */
+		if (p != devargs->args)
+			p--;
+
+		*p = '\0';
+	}
+
+	return 0;
+}
 
 static int
 pci_vfio_map_resource_primary(struct rte_pci_device *dev)
 {
 	struct vfio_device_info device_info = { .argsz = sizeof(device_info) };
 	char pci_addr[PATH_MAX] = {0};
+	rte_uuid_t vf_token;
 	int vfio_dev_fd;
 	struct rte_pci_addr *loc = &dev->addr;
 	int i, ret;
@@ -668,8 +729,12 @@ pci_vfio_map_resource_primary(struct rte_pci_device *dev)
 	snprintf(pci_addr, sizeof(pci_addr), PCI_PRI_FMT,
 			loc->domain, loc->bus, loc->devid, loc->function);
 
+	ret = vfio_pci_vf_token_arg(dev->device.devargs, vf_token);
+	if (ret)
+		return ret;
+
 	ret = rte_vfio_setup_device(rte_pci_get_sysfs_path(), pci_addr,
-					&vfio_dev_fd, &device_info);
+					&vfio_dev_fd, &device_info, vf_token);
 	if (ret)
 		return ret;
 
@@ -798,6 +863,7 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev)
 {
 	struct vfio_device_info device_info = { .argsz = sizeof(device_info) };
 	char pci_addr[PATH_MAX] = {0};
+	rte_uuid_t vf_token;
 	int vfio_dev_fd;
 	struct rte_pci_addr *loc = &dev->addr;
 	int i, ret;
@@ -830,8 +896,12 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev)
 		return -1;
 	}
 
+	ret = vfio_pci_vf_token_arg(dev->device.devargs, vf_token);
+	if (ret)
+		return ret;
+
 	ret = rte_vfio_setup_device(rte_pci_get_sysfs_path(), pci_addr,
-					&vfio_dev_fd, &device_info);
+					&vfio_dev_fd, &device_info, vf_token);
 	if (ret)
 		return ret;
 
diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
index 80dc9aa78..86d5a5f49 100644
--- a/lib/librte_eal/freebsd/eal.c
+++ b/lib/librte_eal/freebsd/eal.c
@@ -995,7 +995,8 @@ rte_eal_vfio_intr_mode(void)
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
 		      __rte_unused int *vfio_dev_fd,
-		      __rte_unused struct vfio_device_info *device_info)
+		      __rte_unused struct vfio_device_info *device_info,
+		      __rte_unused rte_uuid_t vf_token)
 {
 	return -1;
 }
diff --git a/lib/librte_eal/include/rte_vfio.h b/lib/librte_eal/include/rte_vfio.h
index 20ed8c45a..e5476ec6d 100644
--- a/lib/librte_eal/include/rte_vfio.h
+++ b/lib/librte_eal/include/rte_vfio.h
@@ -16,6 +16,8 @@ extern "C" {
 
 #include
 
+#include
+
 /*
  * determine if VFIO is present on the system
  */
@@ -102,13 +104,30 @@ struct vfio_device_info;
  * @param device_info
  *   Device information.
  *
+ * @param vf_token
+ *   Before Linux 5.7, a PF bound to vfio-pci could not use SR-IOV to
+ *   create VFs, for security reasons. The VF token is now introduced to
+ *   express some degree of trust or collaboration between the PF and VFs.
+ *
+ *   A). As a VF device: if the PF is a vfio device and it is bound to the
+ *   vfio-pci driver, the user needs to provide a VF token to access the
+ *   device, in the form of appending a vf_token to the device name, for
+ *   example:
+ *   "-w 04:10.0,vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3"
+ *
+ *   B). As a PF device: when presented with a PF which has VFs in use, the
+ *   user must also provide the current VF token to prove collaboration with
+ *   existing VF users. If VFs are not in use, the VF token provided for the
+ *   PF device will act to set the VF token.
+ *
  * @return
  *   0 on success.
  *   <0 on failure.
  *   >1 if the device cannot be managed this way.
  */
 int rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
-		int *vfio_dev_fd, struct vfio_device_info *device_info);
+		int *vfio_dev_fd, struct vfio_device_info *device_info,
+		rte_uuid_t vf_token);
 
 /**
  * Release a device mapped to a VFIO-managed I/O MMU group.
diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 4502aefed..916082b5d 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -702,7 +702,8 @@ rte_vfio_clear_group(int vfio_group_fd)
 
 int
 rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
-		int *vfio_dev_fd, struct vfio_device_info *device_info)
+		int *vfio_dev_fd, struct vfio_device_info *device_info,
+		rte_uuid_t vf_token)
 {
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
@@ -712,6 +713,7 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 	int vfio_container_fd;
 	int vfio_group_fd;
 	int iommu_group_num;
+	char dev[PATH_MAX];
 	int i, ret;
 
 	/* get group number */
@@ -895,8 +897,19 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				t->type_id, t->name);
 	}
 
+	if (!rte_uuid_is_null(vf_token)) {
+		char vf_token_str[RTE_UUID_STRLEN];
+
+		rte_uuid_unparse(vf_token, vf_token_str, sizeof(vf_token_str));
+		snprintf(dev, sizeof(dev),
+			 "%s vf_token=%s", dev_addr, vf_token_str);
+	} else {
+		snprintf(dev, sizeof(dev),
+			 "%s", dev_addr);
+	}
+
 	/* get a file descriptor for the device */
-	*vfio_dev_fd = ioctl(vfio_group_fd, VFIO_GROUP_GET_DEVICE_FD, dev_addr);
+	*vfio_dev_fd = ioctl(vfio_group_fd, VFIO_GROUP_GET_DEVICE_FD, dev);
 	if (*vfio_dev_fd < 0) {
 		/* if we cannot get a device fd, this implies a problem with
 		 * the VFIO group or the container not having IOMMU configured.
@@ -2081,7 +2094,8 @@ int
 rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		__rte_unused const char *dev_addr,
 		__rte_unused int *vfio_dev_fd,
-		__rte_unused struct vfio_device_info *device_info)
+		__rte_unused struct vfio_device_info *device_info,
+		__rte_unused rte_uuid_t vf_token)
 {
 	return -1;
 }
-- 
2.26.1
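
For readers who want to try the devargs handling outside of DPDK, the two
steps in this patch (pruning "vf_token=" from the device argument string, then
building the name passed to VFIO_GROUP_GET_DEVICE_FD) can be sketched in plain
C as below. This is only an illustration under stated assumptions, not the
code in the patch: prune_vf_token(), build_vfio_dev_name() and UUID_TEXT_LEN
are hypothetical names, and the sketch checks only the token length and its
terminating character, whereas the patch also validates the value with
rte_uuid_parse().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VF_TOKEN_ARG "vf_token="
#define UUID_TEXT_LEN 36 /* 32 hex digits plus 4 dashes, no terminator */

/*
 * Hypothetical helper, not a DPDK API.
 * Copies the UUID text following "vf_token=" into token (which must hold
 * UUID_TEXT_LEN + 1 bytes) and removes the argument from args in place.
 * Returns 1 if a token was found, 0 if absent, -1 if malformed.
 * Assumption: only length and terminator are checked, not UUID syntax.
 */
static int
prune_vf_token(char *args, char *token)
{
	char *p, *val, end;

	token[0] = '\0';

	p = strstr(args, VF_TOKEN_ARG);
	if (p == NULL)
		return 0;

	val = p + strlen(VF_TOKEN_ARG);
	if (strlen(val) < UUID_TEXT_LEN)
		return -1;

	end = val[UUID_TEXT_LEN];
	if (end != '\0' && end != ',')
		return -1;

	memcpy(token, val, UUID_TEXT_LEN);
	token[UUID_TEXT_LEN] = '\0';

	if (end == ',') {
		/* "vf_token=uuid,rest": move the rest over the token */
		memmove(p, val + UUID_TEXT_LEN + 1,
			strlen(val + UUID_TEXT_LEN + 1) + 1);
	} else {
		/* "rest,vf_token=uuid": also drop the preceding comma */
		if (p != args)
			p--;
		*p = '\0';
	}

	return 1;
}

/*
 * Hypothetical helper, not a DPDK API.
 * Builds the device name string: "<pci_addr> vf_token=<uuid>" when a token
 * is present, otherwise just "<pci_addr>".
 */
static void
build_vfio_dev_name(char *out, size_t len, const char *pci_addr,
		const char *token)
{
	assert(out != NULL && len > 0);

	if (token[0] != '\0')
		snprintf(out, len, "%s vf_token=%s", pci_addr, token);
	else
		snprintf(out, len, "%s", pci_addr);
}
```

With the argument string "foo=1,vf_token=2ab74924-c335-45f4-9b16-8569e5b08258,bar=2",
prune_vf_token() leaves "foo=1,bar=2" for the PMD, and build_vfio_dev_name()
called with "0000:87:02.0" produces
"0000:87:02.0 vf_token=2ab74924-c335-45f4-9b16-8569e5b08258". Copying the token
out, rather than temporarily writing a terminator into the argument string as
the patch does, is a design choice of this sketch only.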