From: Haiyue Wang
To: dev@dpdk.org, anatoly.burakov@intel.com, thomas@monjalon.net, vattunuru@marvell.com, jerinj@marvell.com, alex.williamson@redhat.com, david.marchand@redhat.com
Cc: Haiyue Wang
Date: Wed, 22 Apr 2020 13:08:04 +0800
Message-Id: <20200422050804.66781-3-haiyue.wang@intel.com>
In-Reply-To: <20200422050804.66781-1-haiyue.wang@intel.com>
References: <20200305043311.17065-1-vattunuru@marvell.com> <20200422050804.66781-1-haiyue.wang@intel.com>
Subject: [dpdk-dev] [PATCH v9 2/2] eal: support for VFIO-PCI VF token

The vfio-pci kernel module has supported SR-IOV through a VF token
since Linux 5.7.

The VF token can be set by a vfio-pci based PF driver and must be known
by the vfio-pci based VF driver in order to gain access to the device.

Signed-off-by: Haiyue Wang
Acked-by: Vamsi Attunuru
Tested-by: Vamsi Attunuru
---
 devtools/libabigail.abignore           |  2 +
 doc/guides/linux_gsg/linux_drivers.rst | 41 +++++++++++++-
 doc/guides/rel_notes/release_20_05.rst |  5 ++
 drivers/bus/pci/linux/pci_vfio.c       | 74 +++++++++++++++++++++++++-
 lib/librte_eal/freebsd/eal.c           |  3 +-
 lib/librte_eal/include/rte_vfio.h      | 24 ++++++++-
 lib/librte_eal/linux/eal_vfio.c        | 20 +++++--
 7 files changed, 161 insertions(+), 8 deletions(-)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index cd86d89ca..01d987a1e 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -6,6 +6,8 @@
 ; Explicit ignore for driver-only ABI
 [suppress_type]
         name = rte_cryptodev_ops
+[suppress_function]
+        name = rte_vfio_setup_device
 ; Ignore this enum update as it is part of an experimental API
 [suppress_type]
         type_kind = enum
diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
index 238f3e900..b42fd708b 100644
--- a/doc/guides/linux_gsg/linux_drivers.rst
+++ b/doc/guides/linux_gsg/linux_drivers.rst
@@ -72,11 +72,50 @@ Note that in order to use VFIO, your kernel must support it.
 VFIO kernel modules have been included in the Linux kernel since version 3.6.0 and are usually present by default,
 however please consult your distributions documentation to make sure that is the case.
 
+Since Linux version 5.7, the ``vfio-pci`` module supports the creation of virtual
+functions. After the PF is bound to the vfio-pci module, the user can create VFs
+through the sysfs interface, and these VFs are bound to vfio-pci automatically.
+
+When the PF is bound to vfio-pci, it has an initial VF token generated at random.
+For security reasons, this token is write-only: the user cannot read it back from
+the kernel. To access a VF, the user must start the PF with the token parameter
+to set a known VF token (in UUID format); the VF can then be accessed with that
+token.
+
+If the DPDK application running on the PF exits and the user wants to restart the
+PF with a different VF token, this works only when no application such as DPDK or
+KVM is using the VFs. Otherwise, the PF fails to start until all VFs are released;
+after that, the user can choose a new VF token and start the PF device.
+
+DPDK uses the keyword ``vf_token`` as the device argument that passes the VF token
+value to the PF and its related VFs. The PMD should not use this argument; it is
+pruned from the device argument list so that the PMD can parse its own device
+arguments successfully.
+
+.. code-block:: console
+
+    1. Generate the VF token with the uuid command
+        14d63f20-8445-11ea-8900-1f9ce7d5650d
+
+    2. sudo modprobe vfio-pci enable_sriov=1
+
+    3. ./usertools/dpdk-devbind.py -b vfio-pci 0000:86:00.0
+
+    4. echo 2 > /sys/bus/pci/devices/0000:86:00.0/sriov_numvfs
+
+    5. Start the PF:
+        ./x86_64-native-linux-gcc/app/testpmd -l 22-25 -n 4 \
+             -w 86:00.0,vf_token=14d63f20-8445-11ea-8900-1f9ce7d5650d --file-prefix=pf -- -i
+
+    6. Start the VF:
+        ./x86_64-native-linux-gcc/app/testpmd -l 26-29 -n 4 \
+             -w 86:02.0,vf_token=14d63f20-8445-11ea-8900-1f9ce7d5650d --file-prefix=vf0 -- -i
+
 Also, to use VFIO, both kernel and BIOS must support and be configured to use IO virtualization (such as Intel® VT-d).
 
 .. note::
 
-    ``vfio-pci`` module doesn't support the creation of virtual functions.
+    ``vfio-pci`` module doesn't support the creation of virtual functions before Linux version 5.7.
 
 For proper operation of VFIO when running DPDK applications as a non-privileged user, correct permissions should also be set up.
 This can be done by using the DPDK setup script (called dpdk-setup.sh and located in the usertools directory).
diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst
index 709372e5e..9460e1eb2 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -97,6 +97,11 @@ New Features
   by making use of the event device capabilities. The event mode currently supports
   only inline IPsec protocol offload.
 
+* **Added support for the new vfio-pci VF token interface.**
+
+  Since Linux version 5.7, vfio-pci supports a shared VF token (UUID) that represents
+  the trust between an SR-IOV PF and its VFs. The VF token is now appended when
+  requesting access to the device.
 
 Removed Items
 -------------
diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 64cd84a68..efb64e2ba 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -11,6 +11,7 @@
 #include
 
 #include
+#include
 #include
 #include
 #include
@@ -644,12 +645,72 @@ pci_vfio_msix_is_mappable(int vfio_dev_fd, int msix_region)
 
 	return ret;
 }
+static int
+vfio_pci_vf_token_arg(struct rte_devargs *devargs, rte_uuid_t uuid)
+{
+#define VF_TOKEN_ARG "vf_token="
+	char c, *p, *vf_token;
+
+	memset(uuid, 0, sizeof(rte_uuid_t));
+
+	if (devargs == NULL)
+		return 0;
+
+	p = strstr(devargs->args, VF_TOKEN_ARG);
+	if (!p)
+		return 0;
+
+	vf_token = p + strlen(VF_TOKEN_ARG);
+	if (strlen(vf_token) < (RTE_UUID_STRLEN - 1)) {
+		RTE_LOG(ERR, EAL, "The VF token length is too short\n");
+		return -1;
+	}
+
+	c = vf_token[RTE_UUID_STRLEN - 1];
+	if (c != '\0' && c != ',') {
+		RTE_LOG(ERR, EAL,
+			"The VF token ends with an invalid character : %c\n", c);
+		return -1;
+	}
+
+	vf_token[RTE_UUID_STRLEN - 1] = '\0';
+	if (rte_uuid_parse(vf_token, uuid)) {
+		RTE_LOG(ERR, EAL,
+			"The VF token is invalid : %s\n", vf_token);
+		vf_token[RTE_UUID_STRLEN - 1] = c;
+		return -1;
+	}
+
+	RTE_LOG(DEBUG, EAL,
+		"The VF token is found : %s\n", vf_token);
+
+	vf_token[RTE_UUID_STRLEN - 1] = c;
+
+	/* This VF token will be treated as an invalid device argument if the
+	 * PMD calls the rte_devargs parse API with its own valid argument list,
+	 * so it needs to purge this vfio-pci specific argument.
+	 */
+	if (c != '\0') {
+		/* 1. Handle the case : 'vf_token=uuid,arg1=val1' */
+		memmove(p, vf_token + RTE_UUID_STRLEN,
+			strlen(vf_token + RTE_UUID_STRLEN) + 1);
+	} else {
+		/* 2. Handle the case : 'arg1=val1,vf_token=uuid' */
+		if (p != devargs->args)
+			p--;
+
+		*p = '\0';
+	}
+
+	return 0;
+}
 
 static int
 pci_vfio_map_resource_primary(struct rte_pci_device *dev)
 {
 	struct vfio_device_info device_info = { .argsz = sizeof(device_info) };
 	char pci_addr[PATH_MAX] = {0};
+	rte_uuid_t vf_token;
 	int vfio_dev_fd;
 	struct rte_pci_addr *loc = &dev->addr;
 	int i, ret;
@@ -668,8 +729,12 @@ pci_vfio_map_resource_primary(struct rte_pci_device *dev)
 	snprintf(pci_addr, sizeof(pci_addr), PCI_PRI_FMT,
 			loc->domain, loc->bus, loc->devid, loc->function);
 
+	ret = vfio_pci_vf_token_arg(dev->device.devargs, vf_token);
+	if (ret)
+		return ret;
+
 	ret = rte_vfio_setup_device(rte_pci_get_sysfs_path(), pci_addr,
-					&vfio_dev_fd, &device_info);
+					&vfio_dev_fd, &device_info, vf_token);
 	if (ret)
 		return ret;
 
@@ -798,6 +863,7 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev)
 {
 	struct vfio_device_info device_info = { .argsz = sizeof(device_info) };
 	char pci_addr[PATH_MAX] = {0};
+	rte_uuid_t vf_token;
 	int vfio_dev_fd;
 	struct rte_pci_addr *loc = &dev->addr;
 	int i, ret;
@@ -830,8 +896,12 @@ pci_vfio_map_resource_secondary(struct rte_pci_device *dev)
 		return -1;
 	}
 
+	ret = vfio_pci_vf_token_arg(dev->device.devargs, vf_token);
+	if (ret)
+		return ret;
+
 	ret = rte_vfio_setup_device(rte_pci_get_sysfs_path(), pci_addr,
-					&vfio_dev_fd, &device_info);
+					&vfio_dev_fd, &device_info, vf_token);
 	if (ret)
 		return ret;
 
diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
index 80dc9aa78..86d5a5f49 100644
--- a/lib/librte_eal/freebsd/eal.c
+++ b/lib/librte_eal/freebsd/eal.c
@@ -995,7 +995,8 @@ rte_eal_vfio_intr_mode(void)
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
 		      __rte_unused int *vfio_dev_fd,
-		      __rte_unused struct vfio_device_info *device_info)
+		      __rte_unused struct vfio_device_info *device_info,
+		      __rte_unused rte_uuid_t vf_token)
 {
 	return -1;
 }
diff --git a/lib/librte_eal/include/rte_vfio.h b/lib/librte_eal/include/rte_vfio.h
index 20ed8c45a..28d918cde 100644
--- a/lib/librte_eal/include/rte_vfio.h
+++ b/lib/librte_eal/include/rte_vfio.h
@@ -16,6 +16,8 @@ extern "C" {
 
 #include
 
+#include
+
 /*
  * determine if VFIO is present on the system
  */
@@ -102,13 +104,33 @@ struct vfio_device_info;
  * @param device_info
  *   Device information.
  *
+ * @param vf_token
+ *   Before Linux 5.7, a PF bound to vfio-pci could not use SR-IOV to create
+ *   VFs, for security reasons. The VF token was introduced to establish a
+ *   degree of trust or collaboration between the PF and its VFs.
+ *
+ *   A) As a VF device: if the PF is a vfio device bound to the vfio-pci
+ *   driver, the user needs to provide a VF token to access the device, by
+ *   appending a vf_token argument to the device name, for
+ *   example:
+ *   "-w 04:10.0,vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3"
+ *
+ *   B) As a PF device: when presented with a PF which has VFs in use, the
+ *   user must also provide the current VF token to prove collaboration with
+ *   existing VF users. If VFs are not in use, the VF token provided for the
+ *   PF device will act to set the VF token.
+ *
+ *   A null (all-zero) UUID is ignored and is not passed to the vfio-pci
+ *   module.
+ *
  * @return
  *   0 on success.
  *   <0 on failure.
  *   >1 if the device cannot be managed this way.
  */
 int rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
-		int *vfio_dev_fd, struct vfio_device_info *device_info);
+		int *vfio_dev_fd, struct vfio_device_info *device_info,
+		rte_uuid_t vf_token);
 
 /**
  * Release a device mapped to a VFIO-managed I/O MMU group.
diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index d26e1649a..e8d7cbda5 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -702,7 +702,8 @@ rte_vfio_clear_group(int vfio_group_fd)
 
 int
 rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
-		int *vfio_dev_fd, struct vfio_device_info *device_info)
+		int *vfio_dev_fd, struct vfio_device_info *device_info,
+		rte_uuid_t vf_token)
 {
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
@@ -712,6 +713,7 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 	int vfio_container_fd;
 	int vfio_group_fd;
 	int iommu_group_num;
+	char dev[PATH_MAX];
 	int i, ret;
 
 	/* get group number */
@@ -895,8 +897,19 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				t->type_id, t->name);
 	}
 
+	if (!rte_uuid_is_null(vf_token)) {
+		char vf_token_str[RTE_UUID_STRLEN];
+
+		rte_uuid_unparse(vf_token, vf_token_str, sizeof(vf_token_str));
+		snprintf(dev, sizeof(dev),
+			 "%s vf_token=%s", dev_addr, vf_token_str);
+	} else {
+		snprintf(dev, sizeof(dev),
+			 "%s", dev_addr);
+	}
+
 	/* get a file descriptor for the device */
-	*vfio_dev_fd = ioctl(vfio_group_fd, VFIO_GROUP_GET_DEVICE_FD, dev_addr);
+	*vfio_dev_fd = ioctl(vfio_group_fd, VFIO_GROUP_GET_DEVICE_FD, dev);
 	if (*vfio_dev_fd < 0) {
 		/* if we cannot get a device fd, this implies a problem with
 		 * the VFIO group or the container not having IOMMU configured.
@@ -2083,7 +2096,8 @@ int
 rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		__rte_unused const char *dev_addr,
 		__rte_unused int *vfio_dev_fd,
-		__rte_unused struct vfio_device_info *device_info)
+		__rte_unused struct vfio_device_info *device_info,
+		__rte_unused rte_uuid_t vf_token)
 {
 	return -1;
 }
-- 
2.26.2
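
Note on the devargs handling: vfio_pci_vf_token_arg() in this patch locates "vf_token=<uuid>" in the device argument string, validates it, and prunes it in place so the PMD never sees it. The sketch below reproduces that idea in plain C with no DPDK dependency, as an illustration only. The names extract_vf_token(), looks_like_uuid() and UUID_TEXT_LEN are hypothetical and not part of the patch; the syntactic UUID check merely stands in for rte_uuid_parse(), and the token is copied out rather than parsed into an rte_uuid_t.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define VF_TOKEN_KEY "vf_token="
#define UUID_TEXT_LEN 36 /* 8-4-4-4-12 hex digits plus four dashes */

/* Return 1 if s starts with a canonical UUID string, 0 otherwise.
 * Assumption: a purely syntactic check, standing in for rte_uuid_parse().
 */
static int
looks_like_uuid(const char *s)
{
	for (int i = 0; i < UUID_TEXT_LEN; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (s[i] != '-')
				return 0;
		} else if (!isxdigit((unsigned char)s[i])) {
			return 0;
		}
	}
	return 1;
}

/* Hypothetical helper: copy the vf_token value from args into uuid_out
 * (at least UUID_TEXT_LEN + 1 bytes) and remove the key/value pair from
 * args in place.
 * Returns 1 if a token was extracted, 0 if none is present, -1 if malformed.
 */
static int
extract_vf_token(char *args, char *uuid_out)
{
	char *p, *val, end;

	assert(uuid_out != NULL);
	uuid_out[0] = '\0';
	if (args == NULL)
		return 0;

	p = strstr(args, VF_TOKEN_KEY);
	if (p == NULL)
		return 0;

	val = p + strlen(VF_TOKEN_KEY);
	/* The length check comes first so the UUID scan never overruns. */
	if (strlen(val) < UUID_TEXT_LEN || !looks_like_uuid(val))
		return -1;

	end = val[UUID_TEXT_LEN];
	if (end != '\0' && end != ',')
		return -1;

	memcpy(uuid_out, val, UUID_TEXT_LEN);
	uuid_out[UUID_TEXT_LEN] = '\0';

	if (end == ',') {
		/* 'vf_token=uuid,rest': shift the remainder over the pair */
		char *rest = val + UUID_TEXT_LEN + 1;

		memmove(p, rest, strlen(rest) + 1);
	} else {
		/* 'rest,vf_token=uuid': also drop the separating comma */
		if (p != args)
			p--;
		*p = '\0';
	}
	return 1;
}
```

Called on a writable buffer holding "arg1=val1,vf_token=14d63f20-8445-11ea-8900-1f9ce7d5650d", the helper leaves "arg1=val1" in the buffer and the UUID text in uuid_out. Like the patch, it matches the key with strstr(), so an argument whose name merely ends in "vf_token=" would also match; a stricter version would anchor the match at the start of the string or after a comma.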