DPDK patches and discussions
 help / color / Atom feed
* [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
@ 2019-09-06  9:12 vattunuru
  2019-09-06  9:45 ` Thomas Monjalon
  2019-10-08 15:12 ` Stephen Hemminger
  0 siblings, 2 replies; 12+ messages in thread
From: vattunuru @ 2019-09-06  9:12 UTC (permalink / raw)
  To: dev; +Cc: thomas, jerinj, Vamsi Attunuru

From: Vamsi Attunuru <vattunuru@marvell.com>

The DPDK use case such as VF representer or OVS offload etc
would call for PF and VF PCIe devices to bind vfio-pci
module to enable IOMMU protection.

In addition to vSwitch use case, unlike, other PCI class of
devices, Network class of PCIe devices would have additional
responsibility on the PF devices such as promiscuous mode support
etc.

The above use cases demand VFIO needs bound to PF and its
VF devices. This is use case is not supported in Linux kernel,
due to a security issue where it is possible to have
DoS in case if VF attached to guest over vfio-pci and netdev
kernel driver runs on it and which something VF representer
would like to enable it.

Since we can not differentiate, the vfio-pci bounded VF devices
runs DPDK application or netdev driver in guest, we can not
introduce any scheme to fix DoS case and therefore not have
proper support of this in the upstream kernel.

The igb_uio enables such PF and VF binding support for
non-iommu devices to make VF representer or OVS offload
run on non-iommu devices with DoS vulnerability for netdev driver
as VF.

This kernel module, facilitate to enable SRIOV on PF devices,
therefore, to run both PF and VF devices in VFIO mode knowing
its impacts like igb_uio driver functions of non-iommu devices.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 MAINTAINERS                       |   6 +
 config/common_linux               |   1 +
 doc/guides/prog_guide/index.rst   |   1 +
 doc/guides/prog_guide/vfio_pf.rst |  90 ++++++++++
 kernel/linux/Makefile             |   1 +
 kernel/linux/meson.build          |   2 +-
 kernel/linux/vfio_pf/Kbuild       |   2 +
 kernel/linux/vfio_pf/Makefile     |  24 +++
 kernel/linux/vfio_pf/meson.build  |  20 +++
 kernel/linux/vfio_pf/vfio_pf.c    | 360 ++++++++++++++++++++++++++++++++++++++
 10 files changed, 506 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4100260..49178a5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -288,6 +288,12 @@ M: Anatoly Burakov <anatoly.burakov@intel.com>
 F: lib/librte_eal/linux/eal/*vfio*
 F: drivers/bus/pci/linux/*vfio*
 
+Linux VFIO_PF
+M: Vamsi Attunuru <vattunuru@marvell.com>
+M: Jerin Jacob <jerinj@marvell.com>
+F: kernel/linux/vfio_pf/
+F: doc/guides/prog_guide/vfio_pf.rst
+
 FreeBSD EAL (with overlaps)
 M: Bruce Richardson <bruce.richardson@intel.com>
 F: lib/librte_eal/freebsd/Makefile
diff --git a/config/common_linux b/config/common_linux
index 6e25255..70f1e79 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -11,6 +11,7 @@ CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=y
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_KNI_KMOD=y
+CONFIG_RTE_VFIO_PF_KMOD=y
 CONFIG_RTE_LIBRTE_KNI=y
 CONFIG_RTE_LIBRTE_PMD_KNI=y
 CONFIG_RTE_LIBRTE_VHOST=y
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 692409a..601244c 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -43,6 +43,7 @@ Programmer's Guide
     pdump_lib
     multi_proc_support
     kernel_nic_interface
+    vfio_pf
     thread_safety_dpdk_functions
     eventdev
     event_ethernet_rx_adapter
diff --git a/doc/guides/prog_guide/vfio_pf.rst b/doc/guides/prog_guide/vfio_pf.rst
new file mode 100644
index 0000000..7e43cc7
--- /dev/null
+++ b/doc/guides/prog_guide/vfio_pf.rst
@@ -0,0 +1,90 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(C) 2019 Marvell International Ltd.
+
+.. _vfio_pf:
+
+The DPDK VFIO_PF Kernel Module
+------------------------------
+
+The VFIO_PF kernel loadable module ``vfio_pf`` provides sysfs way of enabling
+PCI VF devices when the PCI PF is bound to VFIO driver.
+
+DPDK use case such as VF representer or OVS offload etc would call for PF
+and VF PCIe devices to bind vfio-pci module to enable IOMMU protection.
+
+In addition to vSwitch use case, unlike, other PCI class of devices, Network
+class of PCIe devices would have additional responsibility on the PF devices
+such as promiscuous mode support etc. The above use cases demand VFIO needs
+bound to PF and its VF devices.
+
+These use case is not supported in Linux kernel, due to a security issue
+where it is possible to have DoS in case if VF attached to guest over vfio-pci
+and netdev kernel driver runs on it and which something VF representer would
+like to enable it.
+
+The igb_uio enables such PF and VF binding support for non-iommu devices to
+make VF representer or OVS offload run on non-iommu devices with DoS
+vulnerability for netdev driver as VF.
+
+This kernel module facilitate to enable SRIOV on PF devices, therefore, to run
+both PF and VF devices in VFIO mode knowing its impacts like igb_uio driver
+functions of non-iommu devices.
+
+Example usage
+-------------
+
+Following example demonstrates enabling vfs on Marvell's Octeontx2 platform.
+RVU PF devices PF1 & PF2 are probed on BDFs "0002:02:00.0" and "0002:03:00.0"
+respectively. 2 VFs of each PF are enabled after followng the below procedure.
+
+.. code-block:: console
+
+    # echo "177d a063" >  /sys/bus/pci/drivers/vfio-pci/new_id
+    # echo 0002:02:00.0 > /sys/bus/pci/devices/0002:02:00.0/driver/unbind
+    # echo 0002:02:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
+    # echo 0002:03:00.0 > /sys/bus/pci/devices/0002:03:00.0/driver/unbind
+    # echo 0002:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
+
+    # lspci -k
+     0002:02:00.0 Ethernet controller: Cavium, Inc. Device a063 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+        Kernel driver in use: vfio-pci
+     0002:03:00.0 Ethernet controller: Cavium, Inc. Device a063 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+        Kernel driver in use: vfio-pci
+
+    # insmod build/kernel/linux/vfio_pf/vfio_pf.ko
+
+    # echo 0002:02:00.0 > /sys/vfio_pf/add_device
+    # echo 2 > /sys/vfio_pf/0002\:02\:00.0/num_vfs
+    # echo 0002:03:00.0 > /sys/vfio_pf/add_device
+    # echo 2 > /sys/vfio_pf/0002\:03\:00.0/num_vfs
+
+    # lspci -k
+     0002:02:00.0 Ethernet controller: Cavium, Inc. Device a063 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+        Kernel driver in use: vfio-pci
+     0002:02:00.1 Ethernet controller: Cavium, Inc. Device a064 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+     0002:02:00.2 Ethernet controller: Cavium, Inc. Device a064 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+     0002:03:00.0 Ethernet controller: Cavium, Inc. Device a063 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+        Kernel driver in use: vfio-pci
+     0002:03:00.1 Ethernet controller: Cavium, Inc. Device a064 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+     0002:03:00.2 Ethernet controller: Cavium, Inc. Device a064 (rev 01)
+        Subsystem: Cavium, Inc. Device b200
+
+    # echo 0 > /sys/vfio_pf/0002\:02\:00.0/num_vfs
+    # echo 0002:02:00.0 > /sys/vfio_pf/remove_device
+    # echo 0 > /sys/vfio_pf/0002\:03\:00.0/num_vfs
+    # echo 0002:03:00.0 > /sys/vfio_pf/remove_device
+
+    # rmmod build/kernel/linux/vfio_pf/vfio_pf.ko
+
+Prerequisite
+-------------
+
+PCI PF device needs to be bound to VFIO driver before enabling VFs using
+vfio_pf kernel module.
diff --git a/kernel/linux/Makefile b/kernel/linux/Makefile
index c2c45a3..ea202ec 100644
--- a/kernel/linux/Makefile
+++ b/kernel/linux/Makefile
@@ -5,5 +5,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-$(CONFIG_RTE_EAL_IGB_UIO) += igb_uio
 DIRS-$(CONFIG_RTE_KNI_KMOD) += kni
+DIRS-$(CONFIG_RTE_VFIO_PF_KMOD) += vfio_pf
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/kernel/linux/meson.build b/kernel/linux/meson.build
index 1796cc6..50591dd 100644
--- a/kernel/linux/meson.build
+++ b/kernel/linux/meson.build
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2018 Intel Corporation
 
-subdirs = ['igb_uio', 'kni']
+subdirs = ['igb_uio', 'kni', 'vfio_pf']
 
 # if we are cross-compiling we need kernel_dir specified
 if get_option('kernel_dir') == '' and meson.is_cross_build()
diff --git a/kernel/linux/vfio_pf/Kbuild b/kernel/linux/vfio_pf/Kbuild
new file mode 100644
index 0000000..18365c8
--- /dev/null
+++ b/kernel/linux/vfio_pf/Kbuild
@@ -0,0 +1,2 @@
+ccflags-y := $(MODULE_CFLAGS)
+obj-m := vfio_pf.o
diff --git a/kernel/linux/vfio_pf/Makefile b/kernel/linux/vfio_pf/Makefile
new file mode 100644
index 0000000..0ffbc6f
--- /dev/null
+++ b/kernel/linux/vfio_pf/Makefile
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = vfio_pf
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=100
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -Winline -Wall -Werror
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y := vfio_pf.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/kernel/linux/vfio_pf/meson.build b/kernel/linux/vfio_pf/meson.build
new file mode 100644
index 0000000..e485e4c
--- /dev/null
+++ b/kernel/linux/vfio_pf/meson.build
@@ -0,0 +1,20 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+
+mkfile = custom_target('vfio_pf_makefile',
+	output: 'Makefile',
+	command: ['touch', '@OUTPUT@'])
+
+custom_target('vfio_pf',
+	input: ['vfio_pf.c', 'Kbuild'],
+	output: 'vfio_pf.ko',
+	command: ['make', '-C', kernel_dir + '/build',
+		'M=' + meson.current_build_dir(),
+		'src=' + meson.current_source_dir(),
+		'EXTRA_CFLAGS=-I' + meson.current_source_dir() +
+			'/../../../lib/librte_eal/common/include',
+		'modules'],
+	depends: mkfile,
+	install: true,
+	install_dir: kernel_dir + '/extra/dpdk',
+	build_by_default: get_option('enable_kmods'))
diff --git a/kernel/linux/vfio_pf/vfio_pf.c b/kernel/linux/vfio_pf/vfio_pf.c
new file mode 100644
index 0000000..42f7f93
--- /dev/null
+++ b/kernel/linux/vfio_pf/vfio_pf.c
@@ -0,0 +1,360 @@
+/* SPDX-License-Identifier: GPL-2.0
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+
+struct vfio_pf_group {
+	struct kobject *kobj;
+	struct kobj_attribute add_pf;
+	struct kobj_attribute remove_pf;
+	struct mutex lock;
+	struct list_head head;
+};
+
+static struct vfio_pf_group *pf_group;
+
+#define FMT_NVAL 4
+#define PCI_STR_SIZE sizeof("XXXXXXXX:XX:XX.X")
+
+struct pci_addr {
+	char name[PCI_STR_SIZE + 1];
+	uint32_t domain;
+	uint8_t bus;
+	uint8_t devid;
+	uint8_t function;
+};
+
+struct pf_obj {
+	struct list_head node;
+	struct pci_dev *pdev;
+	struct kobject *kobj;
+	struct kobj_attribute sysfs;
+	struct pci_addr paddr;
+};
+
+static int
+str_split(char *string, int stringlen,
+	  char **tokens, int maxtokens, char delim)
+{
+	int tokstart = 1;
+	int i, tok = 0;
+
+	if (string == NULL || tokens == NULL)
+		goto error;
+
+	for (i = 0; i < stringlen; i++) {
+		if (string[i] == '\0' || tok >= maxtokens)
+			break;
+		if (tokstart) {
+			tokstart = 0;
+			tokens[tok++] = &string[i];
+		}
+		if (string[i] == delim) {
+			string[i] = '\0';
+			tokstart = 1;
+		}
+	}
+	return tok;
+
+error:
+	return -1;
+}
+
+static int
+parse_pci_addr(const char *buf, int bufsize, struct pci_addr *paddr)
+{
+	union splitaddr {
+		struct {
+			char *domain;
+			char *bus;
+			char *devid;
+			char *function;
+		};
+		char *str[FMT_NVAL];
+	} splitaddr;
+
+	char *buf_copy = kstrndup(buf, bufsize, GFP_KERNEL);
+	if (buf_copy == NULL)
+		return -ENOMEM;
+
+	if (str_split(buf_copy, bufsize, splitaddr.str, FMT_NVAL, ':')
+			!= FMT_NVAL - 1)
+		goto error;
+	/* final split is on '.' between devid and function */
+	splitaddr.function = strchr(splitaddr.devid, '.');
+	if (splitaddr.function == NULL)
+		goto error;
+	*splitaddr.function++ = '\0';
+
+	if (kstrtou32(splitaddr.domain, 16, &paddr->domain) ||
+		kstrtou8(splitaddr.bus, 16, &paddr->bus) ||
+		kstrtou8(splitaddr.devid, 16, &paddr->devid) ||
+		kstrtou8(splitaddr.function, 10, &paddr->function))
+		goto error;
+
+	snprintf(paddr->name, sizeof(paddr->name), "%.4x:%.2x:%.2x.%.x",
+		 paddr->domain, paddr->bus, paddr->devid, paddr->function);
+
+	kfree(buf_copy);
+	return 0;
+error:
+	kfree(buf_copy);
+	return -EINVAL;
+}
+
+static ssize_t
+show_num_vfs(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct pf_obj *obj;
+
+	obj = container_of(attr, struct pf_obj, sysfs);
+
+	return snprintf(buf, 10, "%u\n", pci_num_vf(obj->pdev));
+}
+
+static ssize_t
+store_num_vfs(struct kobject *kobj, struct kobj_attribute *attr,
+	      const char *buf, size_t count)
+{
+	struct pf_obj *obj;
+	int num_vfs, err = 0;
+
+	obj = container_of(attr, struct pf_obj, sysfs);
+
+	if (kstrtoint(buf, 0, &num_vfs)) {
+		pr_err("Invalid %s,  %s\n", attr->attr.name, buf);
+		err = -EIO;
+	}
+	if (num_vfs < 0) {
+		pr_err("Invalid %s,  %d < 0\n", attr->attr.name,
+			num_vfs);
+		err = -EIO;
+	}
+
+	if (num_vfs == 0)
+		pci_disable_sriov(obj->pdev);
+	else if (pci_num_vf(obj->pdev) == 0)
+		err = pci_enable_sriov(obj->pdev, num_vfs);
+	else
+		err = -EINVAL;
+
+	return err ? err : count;
+}
+
+static int
+pf_sysfs_create(char *name, struct pf_obj *obj)
+{
+	int err;
+
+	if (name == NULL || obj == NULL)
+		return -EINVAL;
+
+	obj->sysfs.show = show_num_vfs;
+	obj->sysfs.store = store_num_vfs;
+	obj->sysfs.attr.name = name;
+	obj->sysfs.attr.mode = 0644;
+
+	sysfs_attr_init(&obj->sysfs.attr);
+	err = sysfs_create_file(obj->kobj, &obj->sysfs.attr);
+	if (err) {
+		pr_err("Failed to create '%s' sysfs for '%s'\n",
+			name, kobject_name(obj->kobj));
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int
+probe_pf_dev(struct pf_obj *obj)
+{
+	struct pci_dev *pdev = NULL;
+	struct pci_addr *paddr;
+
+	paddr = &obj->paddr;
+
+	pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev);
+	if (pdev == NULL)
+		return -ENODEV;
+
+	while (pdev) {
+		if ((paddr->domain == pci_domain_nr(pdev->bus)) &&
+		    (pdev->bus->number == paddr->bus) &&
+		    (PCI_SLOT(pdev->devfn) == paddr->devid) &&
+		    (PCI_FUNC(pdev->devfn) == paddr->function))
+			break;
+
+		pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev);
+	};
+
+	if (pdev) {
+		obj->pdev = pdev;
+		return 0;
+	} else
+		return -ENODEV;
+}
+
+static ssize_t
+add_device(struct kobject *kobj, struct kobj_attribute *attr,
+	   const char *buf, size_t count)
+{
+	struct pf_obj *obj;
+	int err = 0;
+
+	obj = kzalloc(sizeof(struct pf_obj), GFP_KERNEL);
+	if (obj == NULL) {
+		err = -ENOMEM;
+		goto exit;
+	}
+
+	if (parse_pci_addr(buf, strlen(buf), &obj->paddr)) {
+		err = -EINVAL;
+		goto exit;
+	}
+
+	if (probe_pf_dev(obj)) {
+		err = -ENXIO;
+		goto exit;
+	}
+
+	obj->kobj = kobject_create_and_add(obj->paddr.name, pf_group->kobj);
+
+	if (pf_sysfs_create("num_vfs", obj)) {
+		pci_dev_put(obj->pdev);
+		pr_err("Failed to create the sysfs for pdev:%s\n",
+			obj->paddr.name);
+		return -EFAULT;
+	}
+
+	mutex_lock(&pf_group->lock);
+	list_add(&obj->node, &pf_group->head);
+	mutex_unlock(&pf_group->lock);
+
+exit:
+	if (err && obj)
+		kfree(obj);
+
+	return err ? err : count;
+}
+
+static void
+remove_pf_obj(struct pf_obj *obj)
+{
+	if (pci_num_vf(obj->pdev))
+		pci_disable_sriov(obj->pdev);
+	sysfs_remove_file(obj->kobj, &obj->sysfs.attr);
+	kobject_del(obj->kobj);
+	list_del(&obj->node);
+	pci_dev_put(obj->pdev);
+	kfree(obj);
+}
+
+static ssize_t
+remove_device(struct kobject *kobj, struct kobj_attribute *attr,
+	      const char *buf, size_t count)
+{
+	struct list_head *pos, *tmp;
+	struct pci_addr paddr;
+	struct pf_obj *obj;
+	int err = 0;
+
+	if (parse_pci_addr(buf, strlen(buf), &paddr)) {
+		err = -EINVAL;
+		goto exit;
+	}
+
+	list_for_each_safe(pos, tmp, &pf_group->head) {
+		obj = list_entry(pos, struct pf_obj, node);
+		if (!strncmp(obj->paddr.name, paddr.name, sizeof(paddr.name))) {
+			remove_pf_obj(obj);
+			break;
+		}
+	}
+
+exit:
+	return err ? err : count;
+}
+
+static void
+destroy_pf_objs(void)
+{
+	struct list_head *pos, *tmp;
+	struct pf_obj *obj;
+
+	list_for_each_safe(pos, tmp, &pf_group->head) {
+		obj = list_entry(pos, struct pf_obj, node);
+		remove_pf_obj(obj);
+	}
+}
+
+static void
+__exit vfio_pf_cleanup(void)
+{
+	if (pf_group == NULL)
+		return;
+
+	destroy_pf_objs();
+	sysfs_remove_file(pf_group->kobj, &pf_group->add_pf.attr);
+	sysfs_remove_file(pf_group->kobj, &pf_group->remove_pf.attr);
+	kobject_del(pf_group->kobj);
+	mutex_destroy(&pf_group->lock);
+	kfree(pf_group);
+}
+
+static int
+__init vfio_pf_init(void)
+{
+	int err;
+
+	pf_group = kzalloc(sizeof(struct vfio_pf_group), GFP_KERNEL);
+	if (pf_group == NULL)
+		return -ENOMEM;
+
+	pf_group->kobj = kobject_create_and_add("vfio_pf", NULL);
+	pf_group->add_pf.store = add_device;
+	pf_group->add_pf.attr.name = "add_device";
+	pf_group->add_pf.attr.mode = 0644;
+	pf_group->remove_pf.store = remove_device;
+	pf_group->remove_pf.attr.name = "remove_device";
+	pf_group->remove_pf.attr.mode = 0644;
+
+	sysfs_attr_init(&pf_group->add_pf.attr);
+	err = sysfs_create_file(pf_group->kobj, &pf_group->add_pf.attr);
+	if (err) {
+		pr_err("Failed to create sysfs '%s' for '%s'\n",
+			pf_group->add_pf.attr.name,
+			kobject_name(pf_group->kobj));
+		goto exit;
+	}
+
+	sysfs_attr_init(&pf_group->remove_pf.attr);
+	err = sysfs_create_file(pf_group->kobj, &pf_group->remove_pf.attr);
+	if (err) {
+		pr_err("Failed to create sysfs '%s' for '%s'\n",
+			pf_group->remove_pf.attr.name,
+			kobject_name(pf_group->kobj));
+		sysfs_remove_file(pf_group->kobj, &pf_group->add_pf.attr);
+		goto exit;
+	}
+
+	INIT_LIST_HEAD(&pf_group->head);
+
+	mutex_init(&pf_group->lock);
+
+	return 0;
+
+exit:
+	kfree(pf_group);
+	return -EFAULT;
+}
+
+module_init(vfio_pf_init);
+module_exit(vfio_pf_cleanup);
+
+MODULE_DESCRIPTION("Kernel module for enabling SRIOV");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Marvell International Ltd");
-- 
2.8.4


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-09-06  9:12 [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module vattunuru
@ 2019-09-06  9:45 ` Thomas Monjalon
  2019-09-06 13:27   ` Jerin Jacob Kollanukkaran
  2019-10-08 15:12 ` Stephen Hemminger
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Monjalon @ 2019-09-06  9:45 UTC (permalink / raw)
  To: vattunuru; +Cc: dev, jerinj

06/09/2019 11:12, vattunuru@marvell.com:
> From: Vamsi Attunuru <vattunuru@marvell.com>
> 
> The DPDK use case such as VF representer or OVS offload etc
> would call for PF and VF PCIe devices to bind vfio-pci
> module to enable IOMMU protection.
> 
> In addition to vSwitch use case, unlike, other PCI class of
> devices, Network class of PCIe devices would have additional
> responsibility on the PF devices such as promiscuous mode support
> etc.
> 
> The above use cases demand VFIO needs bound to PF and its
> VF devices. This is use case is not supported in Linux kernel,
> due to a security issue where it is possible to have
> DoS in case if VF attached to guest over vfio-pci and netdev
> kernel driver runs on it and which something VF representer
> would like to enable it.
> 
> Since we can not differentiate, the vfio-pci bounded VF devices
> runs DPDK application or netdev driver in guest, we can not
> introduce any scheme to fix DoS case and therefore not have
> proper support of this in the upstream kernel.
> 
> The igb_uio enables such PF and VF binding support for
> non-iommu devices to make VF representer or OVS offload
> run on non-iommu devices with DoS vulnerability for netdev driver
> as VF.
> 
> This kernel module, facilitate to enable SRIOV on PF devices,
> therefore, to run both PF and VF devices in VFIO mode knowing
> its impacts like igb_uio driver functions of non-iommu devices.
> 
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>

Sorry I fail to properly understand the explanation above.
Please try to split in shorter sentences.

About the request to add an out-of-tree Linux kernel driver,
I guess Jerin is well aware that we don't want such anymore.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-09-06  9:45 ` Thomas Monjalon
@ 2019-09-06 13:27   ` Jerin Jacob Kollanukkaran
  2019-09-25  4:06     ` Vamsi Krishna Attunuru
  2019-10-08  5:07     ` Vamsi Krishna Attunuru
  0 siblings, 2 replies; 12+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-09-06 13:27 UTC (permalink / raw)
  To: Thomas Monjalon, Vamsi Krishna Attunuru; +Cc: dev

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, September 6, 2019 3:15 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>
> Cc: dev@dpdk.org; Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Subject: Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel
> module
> 
> 06/09/2019 11:12, vattunuru@marvell.com:
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > The DPDK use case such as VF representer or OVS offload etc would call
> > for PF and VF PCIe devices to bind vfio-pci module to enable IOMMU
> > protection.
> >
> > In addition to vSwitch use case, unlike, other PCI class of devices,
> > Network class of PCIe devices would have additional responsibility on
> > the PF devices such as promiscuous mode support etc.
> >
> > The above use cases demand VFIO needs bound to PF and its VF devices.
> > This is use case is not supported in Linux kernel, due to a security
> > issue where it is possible to have DoS in case if VF attached to guest
> > over vfio-pci and netdev kernel driver runs on it and which something
> > VF representer would like to enable it.
> >
> > Since we can not differentiate, the vfio-pci bounded VF devices runs
> > DPDK application or netdev driver in guest, we can not introduce any
> > scheme to fix DoS case and therefore not have proper support of this
> > in the upstream kernel.
> >
> > The igb_uio enables such PF and VF binding support for non-iommu
> > devices to make VF representer or OVS offload run on non-iommu devices
> > with DoS vulnerability for netdev driver as VF.
> >
> > This kernel module, facilitate to enable SRIOV on PF devices,
> > therefore, to run both PF and VF devices in VFIO mode knowing its
> > impacts like igb_uio driver functions of non-iommu devices.
> >
> > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> 
> Sorry I fail to properly understand the explanation above.
> Please try to split in shorter sentences.
> 
> About the request to add an out-of-tree Linux kernel driver, I guess Jerin is well
> aware that we don't want such anymore.

Yes. I am aware of it. I don't like the out of tree modules either. But, This case,
I suggested Vamsi to have out of tree module.

Let me describe the issue and let us discuss how to tackle the  problem:

# Linux kernel wont allow VFIO PF to have SRIOV enable.

Patches and on going discussion are here:
https://patchwork.kernel.org/patch/10522381/
https://lwn.net/Articles/748526/

Based on my understanding the reason for NOT allowing the
VFIO PF to have SRIOV enable is genuine from kernel point of
View but not from DPDK point of view.

Here is the sequence  to describe the problem
1) Consider Linux kernel allowed VFIO PCI SRIOV enable
2) PF bound to vfio-pci
3) using SRIOV infrastructure of vfio-pci  PF driver,
VFs  are created
4) DPDK application bound to PF and VF, No issue here.
5) Assume DPDK application bound to PF and VF bound
To netdev kernel driver. Now, there is a genuine  concern
From kernel point of view that, DPDK PF can intercept,
VF mailbox message or so and deny the Kernel request
Or what if DPDK PF application crashes?

To avoid the case (5), (3) is not allowed in stock kernel.
Which makes sense IMO.

Now, From DPDK PoV, step 5 is valid as we have
Rte_flow's VF action etc used to enable such case.
Where, user can program the PF's rte_flow to steer
Some traffic to VF, where VF can be, DPDK application or
Linux kernel netdev driver.

This patch enables the step (3) to enable step (5) from DPDK
PoV. i.e DPDK needs to allow PF to bind to DPDK with VFs.

Why this issue now:
- igb_uio kernel driver is used as enabling step (3)
See store_max_vfs() kernel/linux/igb_uio/igb_uio.c
 This is fine for non-iommu device, IOMMU devices
needs VFIO.
- We would like support VFIO for IOMMU protection
And enable step (5) as DPDK supports form the spec level.
i.e need to fix feature disparity between iommu vs
non-iommu based devices.

Note:
We may not need a  brand new kernel module, we could move
this logic to igb_uio if maintenance is concern.







 


 

 




- 
 






> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-09-06 13:27   ` Jerin Jacob Kollanukkaran
@ 2019-09-25  4:06     ` Vamsi Krishna Attunuru
  2019-09-25  7:18       ` Andrew Rybchenko
  2019-10-08  5:07     ` Vamsi Krishna Attunuru
  1 sibling, 1 reply; 12+ messages in thread
From: Vamsi Krishna Attunuru @ 2019-09-25  4:06 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Thomas Monjalon; +Cc: dev

Hi,

We would like to have these support in 19.11. Please let us know if there are any comments or suggestions, we can discuss and address accordingly.

-----Original Message-----
From: Jerin Jacob Kollanukkaran <jerinj@marvell.com> 
Sent: Friday, September 6, 2019 6:58 PM
To: Thomas Monjalon <thomas@monjalon.net>; Vamsi Krishna Attunuru <vattunuru@marvell.com>
Cc: dev@dpdk.org
Subject: RE: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, September 6, 2019 3:15 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>
> Cc: dev@dpdk.org; Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Subject: Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf 
> kernel module
> 
> 06/09/2019 11:12, vattunuru@marvell.com:
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > The DPDK use case such as VF representer or OVS offload etc would 
> > call for PF and VF PCIe devices to bind vfio-pci module to enable 
> > IOMMU protection.
> >
> > In addition to vSwitch use case, unlike, other PCI class of devices, 
> > Network class of PCIe devices would have additional responsibility 
> > on the PF devices such as promiscuous mode support etc.
> >
> > The above use cases demand VFIO needs bound to PF and its VF devices.
> > This is use case is not supported in Linux kernel, due to a security 
> > issue where it is possible to have DoS in case if VF attached to 
> > guest over vfio-pci and netdev kernel driver runs on it and which 
> > something VF representer would like to enable it.
> >
> > Since we can not differentiate, the vfio-pci bounded VF devices runs 
> > DPDK application or netdev driver in guest, we can not introduce any 
> > scheme to fix DoS case and therefore not have proper support of this 
> > in the upstream kernel.
> >
> > The igb_uio enables such PF and VF binding support for non-iommu 
> > devices to make VF representer or OVS offload run on non-iommu 
> > devices with DoS vulnerability for netdev driver as VF.
> >
> > This kernel module, facilitate to enable SRIOV on PF devices, 
> > therefore, to run both PF and VF devices in VFIO mode knowing its 
> > impacts like igb_uio driver functions of non-iommu devices.
> >
> > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> 
> Sorry I fail to properly understand the explanation above.
> Please try to split in shorter sentences.
> 
> About the request to add an out-of-tree Linux kernel driver, I guess 
> Jerin is well aware that we don't want such anymore.

Yes. I am aware of it. I don't like the out of tree modules either. But, This case, I suggested Vamsi to have out of tree module.

Let me describe the issue and let us discuss how to tackle the  problem:

# Linux kernel wont allow VFIO PF to have SRIOV enable.

Patches and on going discussion are here:
https://patchwork.kernel.org/patch/10522381/
https://lwn.net/Articles/748526/

Based on my understanding the reason for NOT allowing the VFIO PF to have SRIOV enable is genuine from kernel point of View but not from DPDK point of view.

Here is the sequence  to describe the problem
1) Consider Linux kernel allowed VFIO PCI SRIOV enable
2) PF bound to vfio-pci
3) using SRIOV infrastructure of vfio-pci  PF driver, VFs  are created
4) DPDK application bound to PF and VF, No issue here.
5) Assume DPDK application bound to PF and VF bound To netdev kernel driver. Now, there is a genuine  concern From kernel point of view that, DPDK PF can intercept, VF mailbox message or so and deny the Kernel request Or what if DPDK PF application crashes?

To avoid the case (5), (3) is not allowed in stock kernel.
Which makes sense IMO.

Now, From DPDK PoV, step 5 is valid as we have Rte_flow's VF action etc used to enable such case.
Where, user can program the PF's rte_flow to steer Some traffic to VF, where VF can be, DPDK application or Linux kernel netdev driver.

This patch enables the step (3) to enable step (5) from DPDK PoV. i.e DPDK needs to allow PF to bind to DPDK with VFs.

Why this issue now:
- igb_uio kernel driver is used as enabling step (3) See store_max_vfs() kernel/linux/igb_uio/igb_uio.c  This is fine for non-iommu device, IOMMU devices needs VFIO.
- We would like support VFIO for IOMMU protection And enable step (5) as DPDK supports form the spec level.
i.e need to fix feature disparity between iommu vs non-iommu based devices.

Note:
We may not need a  brand new kernel module, we could move this logic to igb_uio if maintenance is concern.







 


 

 




- 
 






> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-09-25  4:06     ` Vamsi Krishna Attunuru
@ 2019-09-25  7:18       ` Andrew Rybchenko
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Rybchenko @ 2019-09-25  7:18 UTC (permalink / raw)
  To: Vamsi Krishna Attunuru, Jerin Jacob Kollanukkaran, Thomas Monjalon; +Cc: dev

On 9/25/19 7:06 AM, Vamsi Krishna Attunuru wrote:
> Hi,
>
> We would like to have these support in 19.11. Please let us know if there are any comments or suggestions, we can discuss and address accordingly.
>
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Friday, September 6, 2019 6:58 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Vamsi Krishna Attunuru <vattunuru@marvell.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
>
>> -----Original Message-----
>> From: Thomas Monjalon <thomas@monjalon.net>
>> Sent: Friday, September 6, 2019 3:15 PM
>> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>
>> Cc: dev@dpdk.org; Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>> Subject: Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf
>> kernel module
>>
>> 06/09/2019 11:12, vattunuru@marvell.com:
>>> From: Vamsi Attunuru <vattunuru@marvell.com>
>>>
>>> The DPDK use case such as VF representer or OVS offload etc would
>>> call for PF and VF PCIe devices to bind vfio-pci module to enable
>>> IOMMU protection.
>>>
>>> In addition to vSwitch use case, unlike, other PCI class of devices,
>>> Network class of PCIe devices would have additional responsibility
>>> on the PF devices such as promiscuous mode support etc.
>>>
>>> The above use cases demand VFIO needs bound to PF and its VF devices.
>>> This is use case is not supported in Linux kernel, due to a security
>>> issue where it is possible to have DoS in case if VF attached to
>>> guest over vfio-pci and netdev kernel driver runs on it and which
>>> something VF representer would like to enable it.
>>>
>>> Since we can not differentiate, the vfio-pci bounded VF devices runs
>>> DPDK application or netdev driver in guest, we can not introduce any
>>> scheme to fix DoS case and therefore not have proper support of this
>>> in the upstream kernel.
>>>
>>> The igb_uio enables such PF and VF binding support for non-iommu
>>> devices to make VF representer or OVS offload run on non-iommu
>>> devices with DoS vulnerability for netdev driver as VF.
>>>
>>> This kernel module, facilitate to enable SRIOV on PF devices,
>>> therefore, to run both PF and VF devices in VFIO mode knowing its
>>> impacts like igb_uio driver functions of non-iommu devices.
>>>
>>> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
>>> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>> Sorry I fail to properly understand the explanation above.
>> Please try to split in shorter sentences.
>>
>> About the request to add an out-of-tree Linux kernel driver, I guess
>> Jerin is well aware that we don't want such anymore.
> Yes. I am aware of it. I don't like the out of tree modules either. But, This case, I suggested Vamsi to have out of tree module.
>
> Let me describe the issue and let us discuss how to tackle the  problem:
>
> # Linux kernel wont allow VFIO PF to have SRIOV enable.
>
> Patches and on going discussion are here:
> https://patchwork.kernel.org/patch/10522381/
> https://lwn.net/Articles/748526/
>
> Based on my understanding the reason for NOT allowing the VFIO PF to have SRIOV enable is genuine from kernel point of View but not from DPDK point of view.
>
> Here is the sequence  to describe the problem
> 1) Consider Linux kernel allowed VFIO PCI SRIOV enable
> 2) PF bound to vfio-pci
> 3) using SRIOV infrastructure of vfio-pci  PF driver, VFs  are created
> 4) DPDK application bound to PF and VF, No issue here.
> 5) Assume DPDK application bound to PF and VF bound To netdev kernel driver. Now, there is a genuine  concern From kernel point of view that, DPDK PF can intercept, VF mailbox message or so and deny the Kernel request Or what if DPDK PF application crashes?
>
> To avoid the case (5), (3) is not allowed in stock kernel.
> Which makes sense IMO.
>
> Now, From DPDK PoV, step 5 is valid as we have Rte_flow's VF action etc used to enable such case.
> Where, user can program the PF's rte_flow to steer Some traffic to VF, where VF can be, DPDK application or Linux kernel netdev driver.
>
> This patch enables the step (3) to enable step (5) from DPDK PoV. i.e DPDK needs to allow PF to bind to DPDK with VFs.
>
> Why this issue now:
> - igb_uio kernel driver is used as enabling step (3) See store_max_vfs() kernel/linux/igb_uio/igb_uio.c  This is fine for non-iommu device, IOMMU devices needs VFIO.
> - We would like support VFIO for IOMMU protection And enable step (5) as DPDK supports form the spec level.
> i.e need to fix feature disparity between iommu vs non-iommu based devices.
>
> Note:
> We may not need a  brand new kernel module, we could move this logic to igb_uio if maintenance is concern.

It is really a problem for not bifurcated drivers and I agree with Jerin 
above.

One driver could be better, but I'm not sure and I'm afraid it will make too
many changes in igb_uio and result in instabilities/bugs. I have no strong
opinion if we can take the risk with igb_uio - simply have no enough
information if it is widely used or Linux in-kernel drivers are preferred.

Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-09-06 13:27   ` Jerin Jacob Kollanukkaran
  2019-09-25  4:06     ` Vamsi Krishna Attunuru
@ 2019-10-08  5:07     ` Vamsi Krishna Attunuru
  1 sibling, 0 replies; 12+ messages in thread
From: Vamsi Krishna Attunuru @ 2019-10-08  5:07 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Sent: Friday, September 6, 2019 6:58 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Vamsi Krishna Attunuru
> <vattunuru@marvell.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel
> module
> 
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, September 6, 2019 3:15 PM
> > To: Vamsi Krishna Attunuru <vattunuru@marvell.com>
> > Cc: dev@dpdk.org; Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> > Subject: Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf
> > kernel module
> >
> > 06/09/2019 11:12, vattunuru@marvell.com:
> > > From: Vamsi Attunuru <vattunuru@marvell.com>
> > >
> > > The DPDK use case such as VF representer or OVS offload etc would
> > > call for PF and VF PCIe devices to bind vfio-pci module to enable
> > > IOMMU protection.
> > >
> > > In addition to vSwitch use case, unlike, other PCI class of devices,
> > > Network class of PCIe devices would have additional responsibility
> > > on the PF devices such as promiscuous mode support etc.
> > >
> > > The above use cases demand VFIO needs bound to PF and its VF devices.
> > > This is use case is not supported in Linux kernel, due to a security
> > > issue where it is possible to have DoS in case if VF attached to
> > > guest over vfio-pci and netdev kernel driver runs on it and which
> > > something VF representer would like to enable it.
> > >
> > > Since we can not differentiate, the vfio-pci bounded VF devices runs
> > > DPDK application or netdev driver in guest, we can not introduce any
> > > scheme to fix DoS case and therefore not have proper support of this
> > > in the upstream kernel.
> > >
> > > The igb_uio enables such PF and VF binding support for non-iommu
> > > devices to make VF representer or OVS offload run on non-iommu
> > > devices with DoS vulnerability for netdev driver as VF.
> > >
> > > This kernel module, facilitate to enable SRIOV on PF devices,
> > > therefore, to run both PF and VF devices in VFIO mode knowing its
> > > impacts like igb_uio driver functions of non-iommu devices.
> > >
> > > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> >
> > Sorry I fail to properly understand the explanation above.
> > Please try to split in shorter sentences.
> >
> > About the request to add an out-of-tree Linux kernel driver, I guess
> > Jerin is well aware that we don't want such anymore.
> 
> Yes. I am aware of it. I don't like the out of tree modules either. But, This
> case, I suggested Vamsi to have out of tree module.
> 
> Let me describe the issue and let us discuss how to tackle the  problem:
> 
> # Linux kernel wont allow VFIO PF to have SRIOV enable.
> 
> Patches and on going discussion are here:
> https://patchwork.kernel.org/patch/10522381/
> https://lwn.net/Articles/748526/
> 
> Based on my understanding the reason for NOT allowing the VFIO PF to have
> SRIOV enable is genuine from kernel point of View but not from DPDK point
> of view.
> 
> Here is the sequence  to describe the problem
> 1) Consider Linux kernel allowed VFIO PCI SRIOV enable
> 2) PF bound to vfio-pci
> 3) using SRIOV infrastructure of vfio-pci  PF driver, VFs  are created
> 4) DPDK application bound to PF and VF, No issue here.
> 5) Assume DPDK application bound to PF and VF bound To netdev kernel
> driver. Now, there is a genuine  concern From kernel point of view that, DPDK
> PF can intercept, VF mailbox message or so and deny the Kernel request Or
> what if DPDK PF application crashes?
> 
> To avoid the case (5), (3) is not allowed in stock kernel.
> Which makes sense IMO.
> 
> Now, From DPDK PoV, step 5 is valid as we have Rte_flow's VF action etc
> used to enable such case.
> Where, user can program the PF's rte_flow to steer Some traffic to VF, where
> VF can be, DPDK application or Linux kernel netdev driver.
> 
> This patch enables the step (3) to enable step (5) from DPDK PoV. i.e DPDK
> needs to allow PF to bind to DPDK with VFs.
> 
> Why this issue now:
> - igb_uio kernel driver is used as enabling step (3) See store_max_vfs()
> kernel/linux/igb_uio/igb_uio.c  This is fine for non-iommu device, IOMMU
> devices needs VFIO.
> - We would like support VFIO for IOMMU protection And enable step (5) as
> DPDK supports form the spec level.
> i.e need to fix feature disparity between iommu vs non-iommu based
> devices.
> 
> Note:
> We may not need a  brand new kernel module, we could move this logic to
> igb_uio if maintenance is concern.
> 

@All, we are expecting to merge this in 19.11 release and if any one have comments please respond.

> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -
> 
> 
> 
> 
> 
> 
> 
> >


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-09-06  9:12 [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module vattunuru
  2019-09-06  9:45 ` Thomas Monjalon
@ 2019-10-08 15:12 ` Stephen Hemminger
  2019-10-08 15:28   ` Jerin Jacob
  1 sibling, 1 reply; 12+ messages in thread
From: Stephen Hemminger @ 2019-10-08 15:12 UTC (permalink / raw)
  To: vattunuru; +Cc: dev, thomas, jerinj

On Fri, 6 Sep 2019 14:42:30 +0530
<vattunuru@marvell.com> wrote:

> From: Vamsi Attunuru <vattunuru@marvell.com>
> 
> The DPDK use case such as VF representer or OVS offload etc
> would call for PF and VF PCIe devices to bind vfio-pci
> module to enable IOMMU protection.
> 
> In addition to vSwitch use case, unlike, other PCI class of
> devices, Network class of PCIe devices would have additional
> responsibility on the PF devices such as promiscuous mode support
> etc.
> 
> The above use cases demand VFIO needs bound to PF and its
> VF devices. This is use case is not supported in Linux kernel,
> due to a security issue where it is possible to have
> DoS in case if VF attached to guest over vfio-pci and netdev
> kernel driver runs on it and which something VF representer
> would like to enable it.
> 
> Since we can not differentiate, the vfio-pci bounded VF devices
> runs DPDK application or netdev driver in guest, we can not
> introduce any scheme to fix DoS case and therefore not have
> proper support of this in the upstream kernel.
> 
> The igb_uio enables such PF and VF binding support for
> non-iommu devices to make VF representer or OVS offload
> run on non-iommu devices with DoS vulnerability for netdev driver
> as VF.
> 
> This kernel module, facilitate to enable SRIOV on PF devices,
> therefore, to run both PF and VF devices in VFIO mode knowing
> its impacts like igb_uio driver functions of non-iommu devices.
> 
> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>

NAK
Having kernel drivers  not in upstream kernel is a long term
maintenance and security risk. Please work with upstream kernel
developers to get this merged there.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-10-08 15:12 ` Stephen Hemminger
@ 2019-10-08 15:28   ` Jerin Jacob
  2019-10-09 23:28     ` Stephen Hemminger
  0 siblings, 1 reply; 12+ messages in thread
From: Jerin Jacob @ 2019-10-08 15:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Vamsi Attunuru, dpdk-dev, thomas, jerinj

On Tue, 8 Oct, 2019, 8:42 PM Stephen Hemminger, <stephen@networkplumber.org>
wrote:

> On Fri, 6 Sep 2019 14:42:30 +0530
> <vattunuru@marvell.com> wrote:
>
> > From: Vamsi Attunuru <vattunuru@marvell.com>
> >
> > The DPDK use case such as VF representer or OVS offload etc
> > would call for PF and VF PCIe devices to bind vfio-pci
> > module to enable IOMMU protection.
> >
> > In addition to vSwitch use case, unlike, other PCI class of
> > devices, Network class of PCIe devices would have additional
> > responsibility on the PF devices such as promiscuous mode support
> > etc.
> >
> > The above use cases demand VFIO needs bound to PF and its
> > VF devices. This is use case is not supported in Linux kernel,
> > due to a security issue where it is possible to have
> > DoS in case if VF attached to guest over vfio-pci and netdev
> > kernel driver runs on it and which something VF representer
> > would like to enable it.
> >
> > Since we can not differentiate, the vfio-pci bounded VF devices
> > runs DPDK application or netdev driver in guest, we can not
> > introduce any scheme to fix DoS case and therefore not have
> > proper support of this in the upstream kernel.
> >
> > The igb_uio enables such PF and VF binding support for
> > non-iommu devices to make VF representer or OVS offload
> > run on non-iommu devices with DoS vulnerability for netdev driver
> > as VF.
> >
> > This kernel module, facilitate to enable SRIOV on PF devices,
> > therefore, to run both PF and VF devices in VFIO mode knowing
> > its impacts like igb_uio driver functions of non-iommu devices.
> >
> > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>
> NAK
> Having kernel drivers  not in upstream kernel is a long term
> maintenance and security risk. Please work with upstream kernel
> developers to get this merged there.
>

There is security issue in attaching DPDK PF driver and netdev bind to VF.
So this scheme is not upsteamble to Linux kernel. Since rte_flow had VF
action. We need this scheme to support VF action with VFIO. So, Out of tree
is the only way as it is DPDK specific feature. Already sent patches to
Linux kernel, it make sense to not accept this in upstream.  We are already
exposing such features through igb-uio for non VFIO device. IMO, there
should not be any disparity between igb-uio and VFIO in DPDK.

If we are against out of tree module, let's remove igb-uio as well. We
can't have different treatment for similar issues.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-10-08 15:28   ` Jerin Jacob
@ 2019-10-09 23:28     ` Stephen Hemminger
  2019-10-10  6:02       ` Jerin Jacob
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Hemminger @ 2019-10-09 23:28 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: Vamsi Attunuru, dpdk-dev, thomas, jerinj

On Tue, 8 Oct 2019 20:58:27 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Tue, 8 Oct, 2019, 8:42 PM Stephen Hemminger, <stephen@networkplumber.org>
> wrote:
> 
> > On Fri, 6 Sep 2019 14:42:30 +0530
> > <vattunuru@marvell.com> wrote:
> >  
> > > From: Vamsi Attunuru <vattunuru@marvell.com>
> > >
> > > The DPDK use case such as VF representer or OVS offload etc
> > > would call for PF and VF PCIe devices to bind vfio-pci
> > > module to enable IOMMU protection.
> > >
> > > In addition to vSwitch use case, unlike, other PCI class of
> > > devices, Network class of PCIe devices would have additional
> > > responsibility on the PF devices such as promiscuous mode support
> > > etc.
> > >
> > > The above use cases demand VFIO needs bound to PF and its
> > > VF devices. This is use case is not supported in Linux kernel,
> > > due to a security issue where it is possible to have
> > > DoS in case if VF attached to guest over vfio-pci and netdev
> > > kernel driver runs on it and which something VF representer
> > > would like to enable it.
> > >
> > > Since we can not differentiate, the vfio-pci bounded VF devices
> > > runs DPDK application or netdev driver in guest, we can not
> > > introduce any scheme to fix DoS case and therefore not have
> > > proper support of this in the upstream kernel.
> > >
> > > The igb_uio enables such PF and VF binding support for
> > > non-iommu devices to make VF representer or OVS offload
> > > run on non-iommu devices with DoS vulnerability for netdev driver
> > > as VF.
> > >
> > > This kernel module, facilitate to enable SRIOV on PF devices,
> > > therefore, to run both PF and VF devices in VFIO mode knowing
> > > its impacts like igb_uio driver functions of non-iommu devices.
> > >
> > > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>  
> >
> > NAK
> > Having kernel drivers  not in upstream kernel is a long term
> > maintenance and security risk. Please work with upstream kernel
> > developers to get this merged there.
> >  
> 
> There is security issue in attaching DPDK PF driver and netdev bind to VF.
> So this scheme is not upsteamble to Linux kernel. Since rte_flow had VF
> action. We need this scheme to support VF action with VFIO. So, Out of tree
> is the only way as it is DPDK specific feature. Already sent patches to
> Linux kernel, it make sense to not accept this in upstream.  We are already
> exposing such features through igb-uio for non VFIO device. IMO, there
> should not be any disparity between igb-uio and VFIO in DPDK.
> 
> If we are against out of tree module, let's remove igb-uio as well. We
> can't have different treatment for similar issues.

You are right igb-uio is legacy and only exists for users unwilling to
change (such as old kernels which don't support vfio noiommu mode).
Eventually it should go as well. And the same for KNI.

Understand the pain, but this feels like a child who gets a no
answer from one parent and therefore thinks they can ask the other parent
and get a yes.

Who is the person in the upstream kernel community that needs to understand
what is needed. Maybe another "I know what I am doing" kernel flag is needed
like no-iommu mode has.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-10-09 23:28     ` Stephen Hemminger
@ 2019-10-10  6:02       ` Jerin Jacob
  2019-10-13  7:20         ` Jerin Jacob
  0 siblings, 1 reply; 12+ messages in thread
From: Jerin Jacob @ 2019-10-10  6:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Vamsi Attunuru, dpdk-dev, thomas, jerinj

On Thu, 10 Oct, 2019, 4:58 AM Stephen Hemminger, <stephen@networkplumber.org>
wrote:

> On Tue, 8 Oct 2019 20:58:27 +0530
> Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > On Tue, 8 Oct, 2019, 8:42 PM Stephen Hemminger, <
> stephen@networkplumber.org>
> > wrote:
> >
> > > On Fri, 6 Sep 2019 14:42:30 +0530
> > > <vattunuru@marvell.com> wrote:
> > >
> > > > From: Vamsi Attunuru <vattunuru@marvell.com>
> > > >
> > > > The DPDK use case such as VF representer or OVS offload etc
> > > > would call for PF and VF PCIe devices to bind vfio-pci
> > > > module to enable IOMMU protection.
> > > >
> > > > In addition to vSwitch use case, unlike, other PCI class of
> > > > devices, Network class of PCIe devices would have additional
> > > > responsibility on the PF devices such as promiscuous mode support
> > > > etc.
> > > >
> > > > The above use cases demand VFIO needs bound to PF and its
> > > > VF devices. This is use case is not supported in Linux kernel,
> > > > due to a security issue where it is possible to have
> > > > DoS in case if VF attached to guest over vfio-pci and netdev
> > > > kernel driver runs on it and which something VF representer
> > > > would like to enable it.
> > > >
> > > > Since we can not differentiate, the vfio-pci bounded VF devices
> > > > runs DPDK application or netdev driver in guest, we can not
> > > > introduce any scheme to fix DoS case and therefore not have
> > > > proper support of this in the upstream kernel.
> > > >
> > > > The igb_uio enables such PF and VF binding support for
> > > > non-iommu devices to make VF representer or OVS offload
> > > > run on non-iommu devices with DoS vulnerability for netdev driver
> > > > as VF.
> > > >
> > > > This kernel module, facilitate to enable SRIOV on PF devices,
> > > > therefore, to run both PF and VF devices in VFIO mode knowing
> > > > its impacts like igb_uio driver functions of non-iommu devices.
> > > >
> > > > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
> > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > NAK
> > > Having kernel drivers  not in upstream kernel is a long term
> > > maintenance and security risk. Please work with upstream kernel
> > > developers to get this merged there.
> > >
> >
> > There is security issue in attaching DPDK PF driver and netdev bind to
> VF.
> > So this scheme is not upsteamble to Linux kernel. Since rte_flow had VF
> > action. We need this scheme to support VF action with VFIO. So, Out of
> tree
> > is the only way as it is DPDK specific feature. Already sent patches to
> > Linux kernel, it make sense to not accept this in upstream.  We are
> already
> > exposing such features through igb-uio for non VFIO device. IMO, there
> > should not be any disparity between igb-uio and VFIO in DPDK.
> >
> > If we are against out of tree module, let's remove igb-uio as well. We
> > can't have different treatment for similar issues.
>
> You are right igb-uio is legacy and only exists for users unwilling to
> change (such as old kernels which don't support vfio noiommu mode).
>

It is NOT completely correct. This specific use case (allowing PF to bind
DPDK and VF to have netdev in non VFIO mode allowed by igb-uio) now. So all
non upsteamble features are going through igb-uio. That's is the main
reason why igb-uio exist.



Eventually it should go as well. And the same for KNI.
>


Nothing works without timeline. If that's the plan then decide the
timeline.Please...



> Understand the pain, but this feels like a child who gets a no
> answer from one parent and therefore thinks they can ask the other parent
> and get a yes.
>
> Who is the person in the upstream kernel community that needs to understand
> what is needed. Maybe another "I know what I am doing" kernel flag is
> needed
> like no-iommu mode has.
>

no-iommu case is different where we cannot screw Linux netdev driver, you
can create a damage to your self that's an acceptable compromise.

In this case, when DPDK PF bound application dies then it will impact
netdev VF driver as gets stalled and there is a security issues to VF
netdev driver that DPDK PF can intersect the netdev VF mailbox message.

So this case is different where from Kernel PoV there is damange to netdev
VF so this can not be accepted in Linux.

One option is to add this piece of code igb-uio instead to adding new
driver. What do you say?

It is the problem for all non bifurcated drivers in DPDK not specific to
Marvell.

The workaround is to use igb-uio with non VFIO.
We can not support UIO through performance reason hence we need a solution
that works for VFIO due to HW accelerated mempool architecture (applicable
for all NPU)

We can not have VFIO vs UIO specific features disparity in DPDK. Please
have same treatment. Either remove igb-uio hacks from Dpdk or enable non
upsteamble features through igb-uio.So that there is no disparity for a
specific use case / vendor.

>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-10-10  6:02       ` Jerin Jacob
@ 2019-10-13  7:20         ` Jerin Jacob
  2019-10-16 11:37           ` Jerin Jacob
  0 siblings, 1 reply; 12+ messages in thread
From: Jerin Jacob @ 2019-10-13  7:20 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Vamsi Attunuru, dpdk-dev, Thomas Monjalon, Jerin Jacob

On Thu, Oct 10, 2019 at 11:32 AM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
>
>
> On Thu, 10 Oct, 2019, 4:58 AM Stephen Hemminger, <stephen@networkplumber.org> wrote:
>>
>> On Tue, 8 Oct 2019 20:58:27 +0530
>> Jerin Jacob <jerinjacobk@gmail.com> wrote:
>>
>> > On Tue, 8 Oct, 2019, 8:42 PM Stephen Hemminger, <stephen@networkplumber.org>
>> > wrote:
>> >
>> > > On Fri, 6 Sep 2019 14:42:30 +0530
>> > > <vattunuru@marvell.com> wrote:
>> > >
>> > > > From: Vamsi Attunuru <vattunuru@marvell.com>
>> > > >
>> > > > The DPDK use case such as VF representer or OVS offload etc
>> > > > would call for PF and VF PCIe devices to bind vfio-pci
>> > > > module to enable IOMMU protection.
>> > > >
>> > > > In addition to vSwitch use case, unlike, other PCI class of
>> > > > devices, Network class of PCIe devices would have additional
>> > > > responsibility on the PF devices such as promiscuous mode support
>> > > > etc.
>> > > >
>> > > > The above use cases demand VFIO needs bound to PF and its
>> > > > VF devices. This is use case is not supported in Linux kernel,
>> > > > due to a security issue where it is possible to have
>> > > > DoS in case if VF attached to guest over vfio-pci and netdev
>> > > > kernel driver runs on it and which something VF representer
>> > > > would like to enable it.
>> > > >
>> > > > Since we can not differentiate, the vfio-pci bounded VF devices
>> > > > runs DPDK application or netdev driver in guest, we can not
>> > > > introduce any scheme to fix DoS case and therefore not have
>> > > > proper support of this in the upstream kernel.
>> > > >
>> > > > The igb_uio enables such PF and VF binding support for
>> > > > non-iommu devices to make VF representer or OVS offload
>> > > > run on non-iommu devices with DoS vulnerability for netdev driver
>> > > > as VF.
>> > > >
>> > > > This kernel module, facilitate to enable SRIOV on PF devices,
>> > > > therefore, to run both PF and VF devices in VFIO mode knowing
>> > > > its impacts like igb_uio driver functions of non-iommu devices.
>> > > >
>> > > > Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
>> > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>> > >
>> > > NAK
>> > > Having kernel drivers  not in upstream kernel is a long term
>> > > maintenance and security risk. Please work with upstream kernel
>> > > developers to get this merged there.
>> > >
>> >
>> > There is security issue in attaching DPDK PF driver and netdev bind to VF.
>> > So this scheme is not upsteamble to Linux kernel. Since rte_flow had VF
>> > action. We need this scheme to support VF action with VFIO. So, Out of tree
>> > is the only way as it is DPDK specific feature. Already sent patches to
>> > Linux kernel, it make sense to not accept this in upstream.  We are already
>> > exposing such features through igb-uio for non VFIO device. IMO, there
>> > should not be any disparity between igb-uio and VFIO in DPDK.
>> >
>> > If we are against out of tree module, let's remove igb-uio as well. We
>> > can't have different treatment for similar issues.
>>
>> You are right igb-uio is legacy and only exists for users unwilling to
>> change (such as old kernels which don't support vfio noiommu mode).
>
>
> It is NOT completely correct. This specific use case (allowing PF to bind DPDK and VF to have netdev in non VFIO mode allowed by igb-uio) now. So all non upsteamble features are going through igb-uio. That's is the main reason why igb-uio exist.
>
>
>
>> Eventually it should go as well. And the same for KNI.
>
>
>
> Nothing works without timeline. If that's the plan then decide the timeline.Please...
>
>
>>
>> Understand the pain, but this feels like a child who gets a no
>> answer from one parent and therefore thinks they can ask the other parent
>> and get a yes.
>>
>> Who is the person in the upstream kernel community that needs to understand
>> what is needed. Maybe another "I know what I am doing" kernel flag is needed
>> like no-iommu mode has.
>
>
> no-iommu case is different where we cannot screw Linux netdev driver, you can create a damage to your self that's an acceptable compromise.
>
> In this case, when DPDK PF bound application dies then it will impact netdev VF driver as gets stalled and there is a security issues to VF netdev driver that DPDK PF can intersect the netdev VF mailbox message.
>
> So this case is different where from Kernel PoV there is damange to netdev VF so this can not be accepted in Linux.
>
> One option is to add this piece of code igb-uio instead to adding new driver. What do you say?
>
> It is the problem for all non bifurcated drivers in DPDK not specific to Marvell.
>
> The workaround is to use igb-uio with non VFIO.
> We can not support UIO through performance reason hence we need a solution that works for VFIO due to HW accelerated mempool architecture (applicable for all NPU)
>
> We can not have VFIO vs UIO specific features disparity in DPDK. Please have same treatment. Either remove igb-uio hacks from Dpdk or enable non upsteamble features through igb-uio.So that there is no disparity for a specific use case / vendor.

Andrew Rybchenko already acked this patch and I provided enough data
on the need for this patch.
I hope there is no more confusion about the need for this patch.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module
  2019-10-13  7:20         ` Jerin Jacob
@ 2019-10-16 11:37           ` Jerin Jacob
  0 siblings, 0 replies; 12+ messages in thread
From: Jerin Jacob @ 2019-10-16 11:37 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Vamsi Attunuru, dpdk-dev, Thomas Monjalon, Jerin Jacob

> >
> >
> > no-iommu case is different where we cannot screw Linux netdev driver, you can create a damage to your self that's an acceptable compromise.
> >
> > In this case, when DPDK PF bound application dies then it will impact netdev VF driver as gets stalled and there is a security issues to VF netdev driver that DPDK PF can intersect the netdev VF mailbox message.
> >
> > So this case is different where from Kernel PoV there is damange to netdev VF so this can not be accepted in Linux.
> >
> > One option is to add this piece of code igb-uio instead to adding new driver. What do you say?
> >
> > It is the problem for all non bifurcated drivers in DPDK not specific to Marvell.
> >
> > The workaround is to use igb-uio with non VFIO.
> > We can not support UIO through performance reason hence we need a solution that works for VFIO due to HW accelerated mempool architecture (applicable for all NPU)
> >
> > We can not have VFIO vs UIO specific features disparity in DPDK. Please have same treatment. Either remove igb-uio hacks from Dpdk or enable non upsteamble features through igb-uio.So that there is no disparity for a specific use case / vendor.
>
> Andrew Rybchenko already acked this patch and I provided enough data
> on the need for this patch.
> I hope there is no more confusion about the need for this patch.


I assume no more issue to merge this patch for RC1. If yes, please discuss.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-06  9:12 [dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module vattunuru
2019-09-06  9:45 ` Thomas Monjalon
2019-09-06 13:27   ` Jerin Jacob Kollanukkaran
2019-09-25  4:06     ` Vamsi Krishna Attunuru
2019-09-25  7:18       ` Andrew Rybchenko
2019-10-08  5:07     ` Vamsi Krishna Attunuru
2019-10-08 15:12 ` Stephen Hemminger
2019-10-08 15:28   ` Jerin Jacob
2019-10-09 23:28     ` Stephen Hemminger
2019-10-10  6:02       ` Jerin Jacob
2019-10-13  7:20         ` Jerin Jacob
2019-10-16 11:37           ` Jerin Jacob

DPDK patches and discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox