DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v2 0/3] Enable uio_pci_generic support
@ 2015-02-19 17:08 Zhou Danny
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 1/3] eal: enable " Zhou Danny
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Zhou Danny @ 2015-02-19 17:08 UTC (permalink / raw)
  To: dev

v2 changes:
- Change variable name of kernel driver with precise comment
- Fix a union definition error in v1 patchset 
- Move redefined macro IORESOURCE_MEM to rte_pci.h with comment

Linux kernel provides UIO as well as VFIO mechanism to support writing user
space device driver. Comparing to UIO which is available since 2.6.32 kernel,
the VFIO is introduced into kernel since version 3.6.0 with better interrupt
and memory protection (build on top of Intel VT-d technology) supports.
Basically, UIO and VFIO do two common things below:
1) Map PCIe device's I/O memory space to user space driver
2) Support device interrupt notification mechanism that notifies user space
   driver/application when a device interrupt triggers.
 
To run an DPDK application and make use of VFIO, two in_kernel modules
vfio and vfio-pci module must be loaded. To use UIO, a DPDK kernel
module igb_uio, which was there since DPDK is invented, must be loaded to
attach to in_kernel uio module. As an solution to deprecate igb_uio,
this patch serials leverage the uio_pci_generic in_kernel module to support
DPDK user space PMD in a generic fashion (similar to how VFIO works), to
remove user space DPDK dependency on GPL code igb_uio in kernel.
 
Example to bind Network Ports to uio_pci_generic:
        modprobe uio
        modprobe uio_pci_generic
        /* to bind device 08:00.0, to the uio_pci_generic driver */
        ./tools/dpdk_nic_bind.py -b uio_pci_generic 08:00.0
 
Note: this patch set does not remove igb_uio support due to igb_uio supports
creating maximum number of SR-IOV VFs (Virtual Functions) by using max_vfs
kernel parameter on older kernels (kernel 3.7.x and below).
Specifically, igb_uio explicitly calls pci_enable_sriov() to create VFs, while
it is not invoked in either uio or uio_pci_generic kernel modules. On kernel 3.8.x
and above, user can use the standard sysfs to enable VFs. For examples:
 
#echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs  // enable VFs
#echo 0 > /sys/class/net/$dev/device/sriov_numvfs                // disable VFs

Danny Zhou (3):
  eal: enable uio_pci_generic support
  eal: add interrupt enable/disable routines for uio_pci_generic
  tools: enable binding NIC device to uio_pci_generic

 lib/librte_eal/common/include/rte_pci.h            |   4 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |  68 +++++--
 lib/librte_eal/linuxapp/eal/eal_pci.c              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          | 204 +++++++++++----------
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   8 +-
 tools/dpdk_nic_bind.py                             |   2 +-
 6 files changed, 175 insertions(+), 115 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v2 1/3] eal: enable uio_pci_generic support
  2015-02-19 17:08 [dpdk-dev] [PATCH v2 0/3] Enable uio_pci_generic support Zhou Danny
@ 2015-02-19 17:08 ` Zhou Danny
  2015-02-20  9:01   ` Thomas Monjalon
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 2/3] eal: add interrupt enable/disable routines for uio_pci_generic Zhou Danny
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Zhou Danny @ 2015-02-19 17:08 UTC (permalink / raw)
  To: dev

v2 changes:
- Change variable name of kernel driver with precise comment.
- Fix a union definition error in v1 patchset.
- Move redefined macro IORESOURCE_MEM to rte_pci.h with comment.

1) Unify procedure to retrieve BAR resource mapping information.
2) Setup bus master bit in NIC's PCIe configuration space for
uio_pci_generic.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
---
 lib/librte_eal/common/include/rte_pci.h            |   4 +
 lib/librte_eal/linuxapp/eal/eal_pci.c              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          | 204 +++++++++++----------
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   8 +-
 4 files changed, 122 insertions(+), 98 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..85f116f 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -104,6 +104,9 @@ extern struct pci_device_list pci_device_list; /**< Global list of PCI devices.
 /** Nb. of values in PCI resource format. */
 #define PCI_RESOURCE_FMT_NVAL 3
 
+/** IO resource type: memory address space */
+#define IORESOURCE_MEM        0x00000200
+
 /**
  * A structure describing a PCI resource.
  */
@@ -148,6 +151,7 @@ struct rte_pci_device {
 	struct rte_pci_id id;                   /**< PCI ID. */
 	struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE];   /**< PCI Memory Resource */
 	struct rte_intr_handle intr_handle;     /**< Interrupt handle */
+	char kernel_driver_name[BUFSIZ];        /**< Kernel driver name */
 	const struct rte_pci_driver *driver;    /**< Associated driver */
 	uint16_t max_vfs;                       /**< sriov enable if not zero */
 	int numa_node;                          /**< NUMA node connection */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 15db9c4..63bcbce 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -138,8 +138,6 @@ pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size)
 }
 
 /* parse the "resource" sysfs file */
-#define IORESOURCE_MEM  0x00000200
-
 static int
 pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev)
 {
@@ -519,7 +517,7 @@ pci_map_device(struct rte_pci_device *dev)
 			return ret;
 	}
 #endif
-	/* map resources for devices that use igb_uio */
+	/* map resources for devices that use uio_pci_generic or igb_uio */
 	if (!mapped) {
 		ret = pci_uio_map_resource(dev);
 		if (ret != 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index e53f06b..03ac006 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -36,6 +36,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/mman.h>
+#include <linux/pci_regs.h>
 
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -47,71 +48,72 @@
 #include "eal_filesystem.h"
 #include "eal_pci_init.h"
 
-static int pci_parse_sysfs_value(const char *filename, uint64_t *val);
+/* driver name of uio_pci_generic kernel module */
+#define UIO_PCI_GENERIC_DRIVER_NAME  "uio_pci_generic"
 
 void *pci_map_addr = NULL;
 
 
 #define OFF_MAX              ((uint64_t)(off_t)-1)
 static int
-pci_uio_get_mappings(const char *devname, struct pci_map maps[], int nb_maps)
+pci_uio_get_mappings(struct rte_pci_device *dev,
+			struct pci_map maps[], int nb_maps)
 {
-	int i;
-	char dirname[PATH_MAX];
+	struct rte_pci_addr *loc = &dev->addr;
+	int i = 0;
 	char filename[PATH_MAX];
-	uint64_t offset, size;
-
-	for (i = 0; i != nb_maps; i++) {
+	unsigned long long start_addr, end_addr, flags;
+	FILE *f;
 
-		/* check if map directory exists */
-		snprintf(dirname, sizeof(dirname),
-			"%s/maps/map%u", devname, i);
+	snprintf(filename, sizeof(filename),
+		SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource",
+		loc->domain, loc->bus, loc->devid, loc->function);
 
-		if (access(dirname, F_OK) != 0)
-			break;
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL,
+		"%s(): cannot open sysfs %s\n",
+		__func__, filename);
+		return -1;
+	}
 
-		/* get mapping offset */
-		snprintf(filename, sizeof(filename),
-			"%s/offset", dirname);
-		if (pci_parse_sysfs_value(filename, &offset) < 0) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot parse offset of %s\n",
-				__func__, dirname);
-			return -1;
+	while (fscanf(f, "%llx %llx %llx", &start_addr,
+			&end_addr, &flags) == 3 && i < nb_maps) {
+		if (flags & IORESOURCE_MEM) {
+			maps[i].offset = 0x0;
+			maps[i].size = end_addr - start_addr + 1;
+			maps[i].phaddr = start_addr;
+			i++;
 		}
+	}
+	fclose(f);
 
-		/* get mapping size */
-		snprintf(filename, sizeof(filename),
-			"%s/size", dirname);
-		if (pci_parse_sysfs_value(filename, &size) < 0) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot parse size of %s\n",
-				__func__, dirname);
-			return -1;
-		}
+	return i;
+}
 
-		/* get mapping physical address */
-		snprintf(filename, sizeof(filename),
-			"%s/addr", dirname);
-		if (pci_parse_sysfs_value(filename, &maps[i].phaddr) < 0) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot parse addr of %s\n",
-				__func__, dirname);
-			return -1;
-		}
+static int
+pci_uio_set_bus_master(int dev_fd)
+{
+	uint16_t reg;
+	int ret;
 
-		if ((offset > OFF_MAX) || (size > SIZE_MAX)) {
-			RTE_LOG(ERR, EAL,
-				"%s(): offset/size exceed system max value\n",
-				__func__);
-			return -1;
-		}
+	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
+	if (ret != sizeof(reg)) {
+		RTE_LOG(ERR, EAL,
+			"Cannot read command from PCI config space!\n");
+		return -1;
+	}
+
+	reg |= PCI_COMMAND_MASTER;
 
-		maps[i].offset = offset;
-		maps[i].size = size;
+	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
+	if (ret != sizeof(reg)) {
+		RTE_LOG(ERR, EAL,
+			"Cannot write command to PCI config space!\n");
+		return -1;
 	}
 
-	return i;
+	return 0;
 }
 
 static int
@@ -213,6 +215,10 @@ pci_get_uio_dev(struct rte_pci_device *dev, char *dstbuf,
 	struct dirent *e;
 	DIR *dir;
 	char dirname[PATH_MAX];
+	FILE *f;
+	char fullpath[PATH_MAX];
+	char buf[BUFSIZ];
+	char *s;
 
 	/* depending on kernel version, uio can be located in uio/uioX
 	 * or uio:uioX */
@@ -268,6 +274,30 @@ pci_get_uio_dev(struct rte_pci_device *dev, char *dstbuf,
 	if (e == NULL)
 		return -1;
 
+	/* remember driver name of uio device */
+	snprintf(fullpath, sizeof(fullpath),
+			SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/uio/%s/name" ,
+			loc->domain, loc->bus, loc->devid,
+			loc->function, e->d_name);
+
+	f = fopen(fullpath, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL,
+			"%s(): cannot open sysfs to get driver name\n",
+			__func__);
+		return -1;
+	}
+	s = fgets(buf, sizeof(buf), f);
+	if (s == NULL) {
+		RTE_LOG(ERR, EAL,
+			"%s(): cannot read sysfs to get driver name\n",
+			__func__);
+		fclose(f);
+		return -1;
+	}
+	strncpy(dev->kernel_driver_name, buf, sizeof(buf));
+	fclose(f);
+
 	/* create uio device if we've been asked to */
 	if (internal_config.create_uio_dev &&
 			pci_mknod_uio_dev(dstbuf, uio_num) < 0)
@@ -282,6 +312,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 {
 	int i, j;
 	char dirname[PATH_MAX];
+	char cfgname[PATH_MAX];
 	char devname[PATH_MAX]; /* contains the /dev/uioX */
 	void *mapaddr;
 	int uio_num;
@@ -294,6 +325,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	struct pci_map *maps;
 
 	dev->intr_handle.fd = -1;
+	dev->intr_handle.uio_cfg_fd = -1;
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 
 	/* secondary processes - use already recorded details */
@@ -318,6 +350,33 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	}
 	dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
 
+	snprintf(cfgname, sizeof(cfgname),
+			"/sys/class/uio/uio%u/device/config", uio_num);
+	dev->intr_handle.uio_cfg_fd = open(cfgname, O_RDWR);
+	if (dev->intr_handle.uio_cfg_fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+			cfgname, strerror(errno));
+		return -1;
+	}
+
+	/* handle Misc. stuff for uio_pci_generic  */
+	if (strncmp(dev->kernel_driver_name, UIO_PCI_GENERIC_DRIVER_NAME,
+			strlen(UIO_PCI_GENERIC_DRIVER_NAME)) == 0) {
+		/* update devname for uio_pci_generic  */
+		snprintf(devname, sizeof(devname),
+			SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
+			loc->domain, loc->bus, loc->devid, loc->function, 0);
+
+		/* set bus master that is not done by uio_pci_generic */
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
+				RTE_LOG(ERR, EAL,
+					"Cannot set up bus mastering!\n");
+				return -1;
+			}
+		}
+	}
+
 	/* allocate the mapping details for secondary processes*/
 	uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
 	if (uio_res == NULL) {
@@ -330,13 +389,12 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	memcpy(&uio_res->pci_addr, &dev->addr, sizeof(uio_res->pci_addr));
 
 	/* collect info about device mappings */
-	nb_maps = pci_uio_get_mappings(dirname, uio_res->maps,
-				       RTE_DIM(uio_res->maps));
+	nb_maps = pci_uio_get_mappings(dev, uio_res->maps,
+					RTE_DIM(uio_res->maps));
 	if (nb_maps < 0) {
 		rte_free(uio_res);
 		return nb_maps;
 	}
-
 	uio_res->nb_maps = nb_maps;
 
 	/* Map all BARs */
@@ -374,16 +432,11 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 			if (maps[j].addr != NULL)
 				fail = 1;
 			else {
-				/* try mapping somewhere close to the end of hugepages */
-				if (pci_map_addr == NULL)
-					pci_map_addr = pci_find_max_end_va();
-
-				mapaddr = pci_map_resource(pci_map_addr, fd, (off_t)offset,
+				mapaddr = pci_map_resource(NULL, fd,
+						(off_t)offset,
 						(size_t)maps[j].size);
-				if (mapaddr == MAP_FAILED)
+				if (mapaddr == NULL)
 					fail = 1;
-
-				pci_map_addr = RTE_PTR_ADD(mapaddr, (size_t) maps[j].size);
 			}
 
 			if (fail) {
@@ -403,38 +456,3 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 
 	return 0;
 }
-
-/*
- * parse a sysfs file containing one integer value
- * different to the eal version, as it needs to work with 64-bit values
- */
-static int
-pci_parse_sysfs_value(const char *filename, uint64_t *val)
-{
-	FILE *f;
-	char buf[BUFSIZ];
-	char *end = NULL;
-
-	f = fopen(filename, "r");
-	if (f == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
-				__func__, filename);
-		return -1;
-	}
-
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
-				__func__, filename);
-		fclose(f);
-		return -1;
-	}
-	*val = strtoull(buf, &end, 0);
-	if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
-		RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
-				__func__, filename);
-		fclose(f);
-		return -1;
-	}
-	fclose(f);
-	return 0;
-}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 23eafd9..6a159c7 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -50,8 +50,12 @@ enum rte_intr_handle_type {
 
 /** Handle for interrupts. */
 struct rte_intr_handle {
-	int vfio_dev_fd;                 /**< VFIO device file descriptor */
-	int fd;                          /**< file descriptor */
+	union {
+		int vfio_dev_fd;  /**< VFIO device file descriptor */
+		int uio_cfg_fd;  /**< UIO config file descriptor
+					for uio_pci_generic */
+	};
+	int fd;	 /**< interrupt event file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
 };
 
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v2 2/3] eal: add interrupt enable/disable routines for uio_pci_generic
  2015-02-19 17:08 [dpdk-dev] [PATCH v2 0/3] Enable uio_pci_generic support Zhou Danny
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 1/3] eal: enable " Zhou Danny
@ 2015-02-19 17:08 ` Zhou Danny
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 3/3] tools: enable binding NIC device to uio_pci_generic Zhou Danny
  2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
  3 siblings, 0 replies; 13+ messages in thread
From: Zhou Danny @ 2015-02-19 17:08 UTC (permalink / raw)
  To: dev

enable/disable interrupt by manipulating a control bit of command
register on NIC's PCIe configuration space.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 68 +++++++++++++++++++++-------
 1 file changed, 52 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index dc2668a..8c5b834 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -361,6 +361,54 @@ vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 }
 #endif
 
+static int
+uio_intr_disable(struct rte_intr_handle *intr_handle)
+{
+	unsigned char command_high;
+
+	/* use UIO config file descriptor for uio_pci_generic */
+	if (pread(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error reading interrupts status for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+	/* disable interrupts */
+	command_high |= 0x4;
+	if (pwrite(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error disabling interrupts for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+uio_intr_enable(struct rte_intr_handle *intr_handle)
+{
+	unsigned char command_high;
+
+	/* use UIO config file descriptor for uio_pci_generic */
+	if (pread(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error reading interrupts status for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+	/* enable interrupts */
+	command_high &= ~0x4;
+	if (pwrite(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error enabling interrupts for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+
+	return 0;
+}
+
 int
 rte_intr_callback_register(struct rte_intr_handle *intr_handle,
 			rte_intr_callback_fn cb, void *cb_arg)
@@ -500,20 +548,14 @@ rte_intr_callback_unregister(struct rte_intr_handle *intr_handle,
 int
 rte_intr_enable(struct rte_intr_handle *intr_handle)
 {
-	const int value = 1;
-
-	if (!intr_handle || intr_handle->fd < 0)
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
 		return -1;
 
 	switch (intr_handle->type){
 	/* write to the uio fd to enable the interrupt */
 	case RTE_INTR_HANDLE_UIO:
-		if (write(intr_handle->fd, &value, sizeof(value)) < 0) {
-			RTE_LOG(ERR, EAL,
-				"Error enabling interrupts for fd %d\n",
-							intr_handle->fd);
+		if (uio_intr_enable(intr_handle))
 			return -1;
-		}
 		break;
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_ALARM:
@@ -546,20 +588,14 @@ rte_intr_enable(struct rte_intr_handle *intr_handle)
 int
 rte_intr_disable(struct rte_intr_handle *intr_handle)
 {
-	const int value = 0;
-
-	if (!intr_handle || intr_handle->fd < 0)
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
 		return -1;
 
 	switch (intr_handle->type){
 	/* write to the uio fd to disable the interrupt */
 	case RTE_INTR_HANDLE_UIO:
-		if (write(intr_handle->fd, &value, sizeof(value)) < 0){
-			RTE_LOG(ERR, EAL,
-				"Error disabling interrupts for fd %d\n",
-							intr_handle->fd);
+		if (uio_intr_disable(intr_handle))
 			return -1;
-		}
 		break;
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_ALARM:
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v2 3/3] tools: enable binding NIC device to uio_pci_generic
  2015-02-19 17:08 [dpdk-dev] [PATCH v2 0/3] Enable uio_pci_generic support Zhou Danny
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 1/3] eal: enable " Zhou Danny
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 2/3] eal: add interrupt enable/disable routines for uio_pci_generic Zhou Danny
@ 2015-02-19 17:08 ` Zhou Danny
  2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
  3 siblings, 0 replies; 13+ messages in thread
From: Zhou Danny @ 2015-02-19 17:08 UTC (permalink / raw)
  To: dev

Add uio_pci_generic to the list of supported kernel drivers.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
---
 tools/dpdk_nic_bind.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
index 812b6a1..2483056 100755
--- a/tools/dpdk_nic_bind.py
+++ b/tools/dpdk_nic_bind.py
@@ -43,7 +43,7 @@ ETHERNET_CLASS = "0200"
 # Each device within this is itself a dictionary of device properties
 devices = {}
 # list of supported DPDK drivers
-dpdk_drivers = [ "igb_uio", "vfio-pci" ]
+dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic" ]
 
 # command-line arg flags
 b_flag = None
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] eal: enable uio_pci_generic support
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 1/3] eal: enable " Zhou Danny
@ 2015-02-20  9:01   ` Thomas Monjalon
  2015-02-20 10:15     ` Bruce Richardson
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Monjalon @ 2015-02-20  9:01 UTC (permalink / raw)
  To: Zhou Danny; +Cc: dev

Hi Danny,

2015-02-20 01:08, Zhou Danny:
> @@ -148,6 +151,7 @@ struct rte_pci_device {
>  	struct rte_pci_id id;                   /**< PCI ID. */
>  	struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE];   /**< PCI Memory Resource */
>  	struct rte_intr_handle intr_handle;     /**< Interrupt handle */
> +	char kernel_driver_name[BUFSIZ];        /**< Kernel driver name */

The comment is not very helpful.
What about something like "Kernel driver to map ressources in userspace"?

You are introducing a new field without filling it for UIO and VFIO.

Testuya and Michael are working on the same thing in hotplug patchset:
	http://dpdk.org/dev/patchwork/patch/3520/
Please help to choose the best approach.

>  	const struct rte_pci_driver *driver;    /**< Associated driver */

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] eal: enable uio_pci_generic support
  2015-02-20  9:01   ` Thomas Monjalon
@ 2015-02-20 10:15     ` Bruce Richardson
  0 siblings, 0 replies; 13+ messages in thread
From: Bruce Richardson @ 2015-02-20 10:15 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Fri, Feb 20, 2015 at 10:01:10AM +0100, Thomas Monjalon wrote:
> Hi Danny,
> 
> 2015-02-20 01:08, Zhou Danny:
> > @@ -148,6 +151,7 @@ struct rte_pci_device {
> >  	struct rte_pci_id id;                   /**< PCI ID. */
> >  	struct rte_pci_resource mem_resource[PCI_MAX_RESOURCE];   /**< PCI Memory Resource */
> >  	struct rte_intr_handle intr_handle;     /**< Interrupt handle */
> > +	char kernel_driver_name[BUFSIZ];        /**< Kernel driver name */
> 
> The comment is not very helpful.
> What about something like "Kernel driver to map ressources in userspace"?
> 
> You are introducing a new field without filling it for UIO and VFIO.
> 
> Testuya and Michael are working on the same thing in hotplug patchset:
> 	http://dpdk.org/dev/patchwork/patch/3520/
> Please help to choose the best approach.
> 
> >  	const struct rte_pci_driver *driver;    /**< Associated driver */
>
Hi Thomas,
since Danny is now off on Chinese New Year break, I'll look into this and see
about sending an updated patch set if I can today.

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support
  2015-02-19 17:08 [dpdk-dev] [PATCH v2 0/3] Enable uio_pci_generic support Zhou Danny
                   ` (2 preceding siblings ...)
  2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 3/3] tools: enable binding NIC device to uio_pci_generic Zhou Danny
@ 2015-02-20 16:59 ` Bruce Richardson
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 1/3] eal: " Bruce Richardson
                     ` (3 more replies)
  3 siblings, 4 replies; 13+ messages in thread
From: Bruce Richardson @ 2015-02-20 16:59 UTC (permalink / raw)
  To: dev

v3 changes
- made processing of uio devices identical, irrespective of igb_uio or uio_pci_generic
- removed storage of kernel driver name from dev node as now unneeded.

v2 changes:
- Change variable name of kernel driver with precise comment
- Fix a union definition error in v1 patchset
- Move redefined macro IORESOURCE_MEM to rte_pci.h with comment

Linux kernel provides UIO as well as VFIO mechanism to support writing user
space device driver. Comparing to UIO which is available since 2.6.32 kernel,
the VFIO is introduced into kernel since version 3.6.0 with better interrupt
and memory protection (build on top of Intel VT-d technology) supports.
Basically, UIO and VFIO do two common things below:
1) Map PCIe device's I/O memory space to user space driver
2) Support device interrupt notification mechanism that notifies user space
   driver/application when a device interrupt triggers.

To run an DPDK application and make use of VFIO, two in_kernel modules
vfio and vfio-pci module must be loaded. To use UIO, a DPDK kernel
module igb_uio, which was there since DPDK is invented, must be loaded to
attach to in_kernel uio module. As an solution to deprecate igb_uio,
this patch serials leverage the uio_pci_generic in_kernel module to support
DPDK user space PMD in a generic fashion (similar to how VFIO works), to
remove user space DPDK dependency on GPL code igb_uio in kernel.

Example to bind Network Ports to uio_pci_generic:
           modprobe uio
           modprobe uio_pci_generic
           /* to bind device 08:00.0, to the uio_pci_generic driver */
           ./tools/dpdk_nic_bind.py -b uio_pci_generic 08:00.0

Note: this patch set does not remove igb_uio support due to igb_uio supports
creating maximum number of SR-IOV VFs (Virtual Functions) by using max_vfs
kernel parameter on older kernels (kernel 3.7.x and below).
Specifically, igb_uio explicitly calls pci_enable_sriov() to create VFs, while
it is not invoked in either uio or uio_pci_generic kernel modules. On kernel 3.8.x
and above, user can use the standard sysfs to enable VFs. For examples:

#echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs  // enable VFs
#echo 0 > /sys/class/net/$dev/device/sriov_numvfs                // disable VFs


Zhou Danny (3):
  eal: enable uio_pci_generic support
  eal: add interrupt enable/disable routines for uio_pci_generic
  tools: enable binding NIC device to uio_pci_generic

 lib/librte_eal/common/include/rte_pci.h            |   3 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |  68 +++++++--
 lib/librte_eal/linuxapp/eal/eal_pci.c              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          | 167 ++++++++++-----------
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   8 +-
 tools/dpdk_nic_bind.py                             |   2 +-
 6 files changed, 144 insertions(+), 108 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v3 1/3] eal: enable uio_pci_generic support
  2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
@ 2015-02-20 16:59   ` Bruce Richardson
  2015-02-23 15:24     ` David Marchand
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 2/3] eal: add interrupt enable/disable routines for uio_pci_generic Bruce Richardson
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Bruce Richardson @ 2015-02-20 16:59 UTC (permalink / raw)
  To: dev

From: Zhou Danny <danny.zhou@intel.com>

Change the EAL PCI code so that it can work with both the
uio_pci_generic in-tree driver, as well as the igb_uio
DPDK-specific driver.

This involves changes to
1) Modify method of retrieving BAR resource mapping information
2) Mapping using resource files in /sys rather than /dev/uio*
2) Setup bus master bit in NIC's PCIe configuration space for
uio_pci_generic.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

---
v3 changes:
- Remove the requirement for the kernel_driver_name to be
 stored for each device
- Ensure greater commonality of code between uio_generic and
 igb_uio, by always checking if bus mastering is on, and
 always using resource0 for mmap.
- Ensure secondary processes do not map resources unused in primary process

v2 changes:
- Change variable name of kernel driver with precise comment.
- Fix a union definition error in v1 patchset.
- Move redefined macro IORESOURCE_MEM to rte_pci.h with comment.
---
 lib/librte_eal/common/include/rte_pci.h            |   3 +
 lib/librte_eal/linuxapp/eal/eal_pci.c              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          | 167 ++++++++++-----------
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   8 +-
 4 files changed, 91 insertions(+), 91 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..4301c16 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -104,6 +104,9 @@ extern struct pci_device_list pci_device_list; /**< Global list of PCI devices.
 /** Nb. of values in PCI resource format. */
 #define PCI_RESOURCE_FMT_NVAL 3
 
+/** IO resource type: memory address space */
+#define IORESOURCE_MEM        0x00000200
+
 /**
  * A structure describing a PCI resource.
  */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 15db9c4..63bcbce 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -138,8 +138,6 @@ pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size)
 }
 
 /* parse the "resource" sysfs file */
-#define IORESOURCE_MEM  0x00000200
-
 static int
 pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev)
 {
@@ -519,7 +517,7 @@ pci_map_device(struct rte_pci_device *dev)
 			return ret;
 	}
 #endif
-	/* map resources for devices that use igb_uio */
+	/* map resources for devices that use uio_pci_generic or igb_uio */
 	if (!mapped) {
 		ret = pci_uio_map_resource(dev);
 		if (ret != 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 54cce08..2b16fcb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -36,6 +36,7 @@
 #include <dirent.h>
 #include <sys/stat.h>
 #include <sys/mman.h>
+#include <linux/pci_regs.h>
 
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -47,71 +48,73 @@
 #include "eal_filesystem.h"
 #include "eal_pci_init.h"
 
-static int pci_parse_sysfs_value(const char *filename, uint64_t *val);
-
 void *pci_map_addr = NULL;
 
 
 #define OFF_MAX              ((uint64_t)(off_t)-1)
 static int
-pci_uio_get_mappings(const char *devname, struct pci_map maps[], int nb_maps)
+pci_uio_get_mappings(struct rte_pci_device *dev,
+			struct pci_map maps[], int nb_maps)
 {
-	int i;
-	char dirname[PATH_MAX];
+	struct rte_pci_addr *loc = &dev->addr;
+	int i = 0;
 	char filename[PATH_MAX];
-	uint64_t offset, size;
+	unsigned long long start_addr, end_addr, flags;
+	FILE *f;
 
-	for (i = 0; i != nb_maps; i++) {
+	snprintf(filename, sizeof(filename),
+		SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource",
+		loc->domain, loc->bus, loc->devid, loc->function);
 
-		/* check if map directory exists */
-		snprintf(dirname, sizeof(dirname),
-			"%s/maps/map%u", devname, i);
-
-		if (access(dirname, F_OK) != 0)
-			break;
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL,
+		"%s(): cannot open sysfs %s\n",
+		__func__, filename);
+		return -1;
+	}
 
-		/* get mapping offset */
-		snprintf(filename, sizeof(filename),
-			"%s/offset", dirname);
-		if (pci_parse_sysfs_value(filename, &offset) < 0) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot parse offset of %s\n",
-				__func__, dirname);
-			return -1;
+	while (fscanf(f, "%llx %llx %llx", &start_addr,
+			&end_addr, &flags) == 3 && i < nb_maps) {
+		if (flags & IORESOURCE_MEM) {
+			maps[i].offset = 0x0;
+			maps[i].size = end_addr - start_addr + 1;
+			maps[i].phaddr = start_addr;
+			i++;
 		}
+	}
+	fclose(f);
 
-		/* get mapping size */
-		snprintf(filename, sizeof(filename),
-			"%s/size", dirname);
-		if (pci_parse_sysfs_value(filename, &size) < 0) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot parse size of %s\n",
-				__func__, dirname);
-			return -1;
-		}
+	return i;
+}
 
-		/* get mapping physical address */
-		snprintf(filename, sizeof(filename),
-			"%s/addr", dirname);
-		if (pci_parse_sysfs_value(filename, &maps[i].phaddr) < 0) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot parse addr of %s\n",
-				__func__, dirname);
-			return -1;
-		}
+static int
+pci_uio_set_bus_master(int dev_fd)
+{
+	uint16_t reg;
+	int ret;
 
-		if ((offset > OFF_MAX) || (size > SIZE_MAX)) {
-			RTE_LOG(ERR, EAL,
-				"%s(): offset/size exceed system max value\n",
-				__func__);
-			return -1;
-		}
+	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
+	if (ret != sizeof(reg)) {
+		RTE_LOG(ERR, EAL,
+			"Cannot read command from PCI config space!\n");
+		return -1;
+	}
+
+	/* return if bus mastering is already on */
+	if (reg & PCI_COMMAND_MASTER)
+		return 0;
+
+	reg |= PCI_COMMAND_MASTER;
 
-		maps[i].offset = offset;
-		maps[i].size = size;
+	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
+	if (ret != sizeof(reg)) {
+		RTE_LOG(ERR, EAL,
+			"Cannot write command to PCI config space!\n");
+		return -1;
 	}
 
-	return i;
+	return 0;
 }
 
 static int
@@ -127,6 +130,10 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
 			continue;
 
 		for (i = 0; i != uio_res->nb_maps; i++) {
+			/* ignore mappings unused in primary process */
+			if (uio_res->maps[i].addr == NULL)
+				continue;
+
 			/*
 			 * open devname, to mmap it
 			 */
@@ -282,6 +289,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 {
 	int i, j;
 	char dirname[PATH_MAX];
+	char cfgname[PATH_MAX];
 	char devname[PATH_MAX]; /* contains the /dev/uioX */
 	void *mapaddr;
 	int uio_num;
@@ -294,6 +302,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	struct pci_map *maps;
 
 	dev->intr_handle.fd = -1;
+	dev->intr_handle.uio_cfg_fd = -1;
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 
 	/* secondary processes - use already recorded details */
@@ -318,6 +327,28 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	}
 	dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
 
+	snprintf(cfgname, sizeof(cfgname),
+			"/sys/class/uio/uio%u/device/config", uio_num);
+	dev->intr_handle.uio_cfg_fd = open(cfgname, O_RDWR);
+	if (dev->intr_handle.uio_cfg_fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+			cfgname, strerror(errno));
+		return -1;
+	}
+
+	/* update devname for mmap  */
+	snprintf(devname, sizeof(devname),
+		SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
+		loc->domain, loc->bus, loc->devid, loc->function, 0);
+
+	/* set bus master that is not done by uio_pci_generic */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
+			RTE_LOG(ERR, EAL, "Cannot set up bus mastering!\n");
+			return -1;
+		}
+	}
+
 	/* allocate the mapping details for secondary processes*/
 	uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
 	if (uio_res == NULL) {
@@ -330,13 +361,12 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	memcpy(&uio_res->pci_addr, &dev->addr, sizeof(uio_res->pci_addr));
 
 	/* collect info about device mappings */
-	nb_maps = pci_uio_get_mappings(dirname, uio_res->maps,
-				       RTE_DIM(uio_res->maps));
+	nb_maps = pci_uio_get_mappings(dev, uio_res->maps,
+					RTE_DIM(uio_res->maps));
 	if (nb_maps < 0) {
 		rte_free(uio_res);
 		return nb_maps;
 	}
-
 	uio_res->nb_maps = nb_maps;
 
 	/* Map all BARs */
@@ -403,38 +433,3 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 
 	return 0;
 }
-
-/*
- * parse a sysfs file containing one integer value
- * different to the eal version, as it needs to work with 64-bit values
- */
-static int
-pci_parse_sysfs_value(const char *filename, uint64_t *val)
-{
-	FILE *f;
-	char buf[BUFSIZ];
-	char *end = NULL;
-
-	f = fopen(filename, "r");
-	if (f == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): cannot open sysfs value %s\n",
-				__func__, filename);
-		return -1;
-	}
-
-	if (fgets(buf, sizeof(buf), f) == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
-				__func__, filename);
-		fclose(f);
-		return -1;
-	}
-	*val = strtoull(buf, &end, 0);
-	if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
-		RTE_LOG(ERR, EAL, "%s(): cannot parse sysfs value %s\n",
-				__func__, filename);
-		fclose(f);
-		return -1;
-	}
-	fclose(f);
-	return 0;
-}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 23eafd9..6a159c7 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -50,8 +50,12 @@ enum rte_intr_handle_type {
 
 /** Handle for interrupts. */
 struct rte_intr_handle {
-	int vfio_dev_fd;                 /**< VFIO device file descriptor */
-	int fd;                          /**< file descriptor */
+	union {
+		int vfio_dev_fd;  /**< VFIO device file descriptor */
+		int uio_cfg_fd;  /**< UIO config file descriptor
+					for uio_pci_generic */
+	};
+	int fd;	 /**< interrupt event file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
 };
 
-- 
2.1.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v3 2/3] eal: add interrupt enable/disable routines for uio_pci_generic
  2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 1/3] eal: " Bruce Richardson
@ 2015-02-20 16:59   ` Bruce Richardson
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 3/3] tools: enable binding NIC device to uio_pci_generic Bruce Richardson
  2015-02-20 17:44   ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Declan Doherty
  3 siblings, 0 replies; 13+ messages in thread
From: Bruce Richardson @ 2015-02-20 16:59 UTC (permalink / raw)
  To: dev

From: Zhou Danny <danny.zhou@intel.com>

enable/disable interrupt by manipulating a control bit of command
register on NIC's PCIe configuration space.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 68 +++++++++++++++++++++-------
 1 file changed, 52 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index dc2668a..8c5b834 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -361,6 +361,54 @@ vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 }
 #endif
 
+static int
+uio_intr_disable(struct rte_intr_handle *intr_handle)
+{
+	unsigned char command_high;
+
+	/* use UIO config file descriptor for uio_pci_generic */
+	if (pread(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error reading interrupts status for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+	/* disable interrupts */
+	command_high |= 0x4;
+	if (pwrite(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error disabling interrupts for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+uio_intr_enable(struct rte_intr_handle *intr_handle)
+{
+	unsigned char command_high;
+
+	/* use UIO config file descriptor for uio_pci_generic */
+	if (pread(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error reading interrupts status for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+	/* enable interrupts */
+	command_high &= ~0x4;
+	if (pwrite(intr_handle->uio_cfg_fd, &command_high, 1, 5) != 1) {
+		RTE_LOG(ERR, EAL,
+			"Error enabling interrupts for fd %d\n",
+			intr_handle->uio_cfg_fd);
+		return -1;
+	}
+
+	return 0;
+}
+
 int
 rte_intr_callback_register(struct rte_intr_handle *intr_handle,
 			rte_intr_callback_fn cb, void *cb_arg)
@@ -500,20 +548,14 @@ rte_intr_callback_unregister(struct rte_intr_handle *intr_handle,
 int
 rte_intr_enable(struct rte_intr_handle *intr_handle)
 {
-	const int value = 1;
-
-	if (!intr_handle || intr_handle->fd < 0)
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
 		return -1;
 
 	switch (intr_handle->type){
 	/* write to the uio fd to enable the interrupt */
 	case RTE_INTR_HANDLE_UIO:
-		if (write(intr_handle->fd, &value, sizeof(value)) < 0) {
-			RTE_LOG(ERR, EAL,
-				"Error enabling interrupts for fd %d\n",
-							intr_handle->fd);
+		if (uio_intr_enable(intr_handle))
 			return -1;
-		}
 		break;
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_ALARM:
@@ -546,20 +588,14 @@ rte_intr_enable(struct rte_intr_handle *intr_handle)
 int
 rte_intr_disable(struct rte_intr_handle *intr_handle)
 {
-	const int value = 0;
-
-	if (!intr_handle || intr_handle->fd < 0)
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
 		return -1;
 
 	switch (intr_handle->type){
 	/* write to the uio fd to disable the interrupt */
 	case RTE_INTR_HANDLE_UIO:
-		if (write(intr_handle->fd, &value, sizeof(value)) < 0){
-			RTE_LOG(ERR, EAL,
-				"Error disabling interrupts for fd %d\n",
-							intr_handle->fd);
+		if (uio_intr_disable(intr_handle))
 			return -1;
-		}
 		break;
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_ALARM:
-- 
2.1.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dpdk-dev] [PATCH v3 3/3] tools: enable binding NIC device to uio_pci_generic
  2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 1/3] eal: " Bruce Richardson
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 2/3] eal: add interrupt enable/disable routines for uio_pci_generic Bruce Richardson
@ 2015-02-20 16:59   ` Bruce Richardson
  2015-02-20 17:44   ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Declan Doherty
  3 siblings, 0 replies; 13+ messages in thread
From: Bruce Richardson @ 2015-02-20 16:59 UTC (permalink / raw)
  To: dev

From: Zhou Danny <danny.zhou@intel.com>

Add uio_pci_generic to the list of supported kernel drivers.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 tools/dpdk_nic_bind.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
index 812b6a1..2483056 100755
--- a/tools/dpdk_nic_bind.py
+++ b/tools/dpdk_nic_bind.py
@@ -43,7 +43,7 @@ ETHERNET_CLASS = "0200"
 # Each device within this is itself a dictionary of device properties
 devices = {}
 # list of supported DPDK drivers
-dpdk_drivers = [ "igb_uio", "vfio-pci" ]
+dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic" ]
 
 # command-line arg flags
 b_flag = None
-- 
2.1.0

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support
  2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
                     ` (2 preceding siblings ...)
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 3/3] tools: enable binding NIC device to uio_pci_generic Bruce Richardson
@ 2015-02-20 17:44   ` Declan Doherty
  2015-02-20 22:35     ` Thomas Monjalon
  3 siblings, 1 reply; 13+ messages in thread
From: Declan Doherty @ 2015-02-20 17:44 UTC (permalink / raw)
  To: Bruce Richardson, dev

On 20/02/15 16:59, Bruce Richardson wrote:
> v3 changes
> - made processing of uio devices identical, irrespective of igb_uio or uio_pci_generic
> - removed storage of kernel driver name from dev node as now unneeded.
>
> v2 changes:
> - Change variable name of kernel driver with precise comment
> - Fix a union definition error in v1 patchset
> - Move redefined macro IORESOURCE_MEM to rte_pci.h with comment
>
> Linux kernel provides UIO as well as VFIO mechanism to support writing user
> space device driver. Comparing to UIO which is available since 2.6.32 kernel,
> the VFIO is introduced into kernel since version 3.6.0 with better interrupt
> and memory protection (build on top of Intel VT-d technology) supports.
> Basically, UIO and VFIO do two common things below:
> 1) Map PCIe device's I/O memory space to user space driver
> 2) Support device interrupt notification mechanism that notifies user space
>     driver/application when a device interrupt triggers.
>
> To run an DPDK application and make use of VFIO, two in_kernel modules
> vfio and vfio-pci module must be loaded. To use UIO, a DPDK kernel
> module igb_uio, which was there since DPDK is invented, must be loaded to
> attach to in_kernel uio module. As an solution to deprecate igb_uio,
> this patch serials leverage the uio_pci_generic in_kernel module to support
> DPDK user space PMD in a generic fashion (similar to how VFIO works), to
> remove user space DPDK dependency on GPL code igb_uio in kernel.
>
> Example to bind Network Ports to uio_pci_generic:
>             modprobe uio
>             modprobe uio_pci_generic
>             /* to bind device 08:00.0, to the uio_pci_generic driver */
>             ./tools/dpdk_nic_bind.py -b uio_pci_generic 08:00.0
>
> Note: this patch set does not remove igb_uio support due to igb_uio supports
> creating maximum number of SR-IOV VFs (Virtual Functions) by using max_vfs
> kernel parameter on older kernels (kernel 3.7.x and below).
> Specifically, igb_uio explicitly calls pci_enable_sriov() to create VFs, while
> it is not invoked in either uio or uio_pci_generic kernel modules. On kernel 3.8.x
> and above, user can use the standard sysfs to enable VFs. For examples:
>
> #echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs  // enable VFs
> #echo 0 > /sys/class/net/$dev/device/sriov_numvfs                // disable VFs
>
>
> Zhou Danny (3):
>    eal: enable uio_pci_generic support
>    eal: add interrupt enable/disable routines for uio_pci_generic
>    tools: enable binding NIC device to uio_pci_generic
>
>   lib/librte_eal/common/include/rte_pci.h            |   3 +
>   lib/librte_eal/linuxapp/eal/eal_interrupts.c       |  68 +++++++--
>   lib/librte_eal/linuxapp/eal/eal_pci.c              |   4 +-
>   lib/librte_eal/linuxapp/eal/eal_pci_uio.c          | 167 ++++++++++-----------
>   .../linuxapp/eal/include/exec-env/rte_interrupts.h |   8 +-
>   tools/dpdk_nic_bind.py                             |   2 +-
>   6 files changed, 144 insertions(+), 108 deletions(-)
>


Series Acked-by: Declan Doherty <declan.doherty@intel.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support
  2015-02-20 17:44   ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Declan Doherty
@ 2015-02-20 22:35     ` Thomas Monjalon
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Monjalon @ 2015-02-20 22:35 UTC (permalink / raw)
  To: Bruce Richardson, danny.zhou; +Cc: dev

> > v3 changes
> > - made processing of uio devices identical, irrespective of igb_uio or uio_pci_generic
> > - removed storage of kernel driver name from dev node as now unneeded.
> >
> > v2 changes:
> > - Change variable name of kernel driver with precise comment
> > - Fix a union definition error in v1 patchset
> > - Move redefined macro IORESOURCE_MEM to rte_pci.h with comment
> >
> > Linux kernel provides UIO as well as VFIO mechanism to support writing user
> > space device driver. Comparing to UIO which is available since 2.6.32 kernel,
> > the VFIO is introduced into kernel since version 3.6.0 with better interrupt
> > and memory protection (build on top of Intel VT-d technology) supports.
> > Basically, UIO and VFIO do two common things below:
> > 1) Map PCIe device's I/O memory space to user space driver
> > 2) Support device interrupt notification mechanism that notifies user space
> >     driver/application when a device interrupt triggers.
> >
> > To run an DPDK application and make use of VFIO, two in_kernel modules
> > vfio and vfio-pci module must be loaded. To use UIO, a DPDK kernel
> > module igb_uio, which was there since DPDK is invented, must be loaded to
> > attach to in_kernel uio module. As an solution to deprecate igb_uio,
> > this patch serials leverage the uio_pci_generic in_kernel module to support
> > DPDK user space PMD in a generic fashion (similar to how VFIO works), to
> > remove user space DPDK dependency on GPL code igb_uio in kernel.
> >
> > Example to bind Network Ports to uio_pci_generic:
> >             modprobe uio
> >             modprobe uio_pci_generic
> >             /* to bind device 08:00.0, to the uio_pci_generic driver */
> >             ./tools/dpdk_nic_bind.py -b uio_pci_generic 08:00.0
> >
> > Note: this patch set does not remove igb_uio support due to igb_uio supports
> > creating maximum number of SR-IOV VFs (Virtual Functions) by using max_vfs
> > kernel parameter on older kernels (kernel 3.7.x and below).
> > Specifically, igb_uio explicitly calls pci_enable_sriov() to create VFs, while
> > it is not invoked in either uio or uio_pci_generic kernel modules. On kernel 3.8.x
> > and above, user can use the standard sysfs to enable VFs. For examples:
> >
> > #echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs  // enable VFs
> > #echo 0 > /sys/class/net/$dev/device/sriov_numvfs                // disable VFs
> >
> >
> > Zhou Danny (3):
> >    eal: enable uio_pci_generic support
> >    eal: add interrupt enable/disable routines for uio_pci_generic
> >    tools: enable binding NIC device to uio_pci_generic
> 
> Series Acked-by: Declan Doherty <declan.doherty@intel.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/3] eal: enable uio_pci_generic support
  2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 1/3] eal: " Bruce Richardson
@ 2015-02-23 15:24     ` David Marchand
  0 siblings, 0 replies; 13+ messages in thread
From: David Marchand @ 2015-02-23 15:24 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Hello,

Ok this is coming too late, but anyway, my comments.


On Fri, Feb 20, 2015 at 5:59 PM, Bruce Richardson <
bruce.richardson@intel.com> wrote:

[snip]

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> index 54cce08..2b16fcb 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> @@ -36,6 +36,7 @@
>  #include <dirent.h>
>  #include <sys/stat.h>
>  #include <sys/mman.h>
> +#include <linux/pci_regs.h>
>
>  #include <rte_log.h>
>  #include <rte_pci.h>
> @@ -47,71 +48,73 @@
>  #include "eal_filesystem.h"
>  #include "eal_pci_init.h"
>
> -static int pci_parse_sysfs_value(const char *filename, uint64_t *val);
> -
>  void *pci_map_addr = NULL;
>
>
>  #define OFF_MAX              ((uint64_t)(off_t)-1)
>  static int
> -pci_uio_get_mappings(const char *devname, struct pci_map maps[], int
> nb_maps)
> +pci_uio_get_mappings(struct rte_pci_device *dev,
> +                       struct pci_map maps[], int nb_maps)
>  {
> -       int i;
> -       char dirname[PATH_MAX];
> +       struct rte_pci_addr *loc = &dev->addr;
> +       int i = 0;
>         char filename[PATH_MAX];
> -       uint64_t offset, size;
> +       unsigned long long start_addr, end_addr, flags;
> +       FILE *f;
>
> -       for (i = 0; i != nb_maps; i++) {
> +       snprintf(filename, sizeof(filename),
> +               SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource",
> +               loc->domain, loc->bus, loc->devid, loc->function);
>
> -               /* check if map directory exists */
> -               snprintf(dirname, sizeof(dirname),
> -                       "%s/maps/map%u", devname, i);
> -
> -               if (access(dirname, F_OK) != 0)
> -                       break;
> +       f = fopen(filename, "r");
> +       if (f == NULL) {
> +               RTE_LOG(ERR, EAL,
> +               "%s(): cannot open sysfs %s\n",
> +               __func__, filename);
> +               return -1;
> +       }
>
> -               /* get mapping offset */
> -               snprintf(filename, sizeof(filename),
> -                       "%s/offset", dirname);
> -               if (pci_parse_sysfs_value(filename, &offset) < 0) {
> -                       RTE_LOG(ERR, EAL,
> -                               "%s(): cannot parse offset of %s\n",
> -                               __func__, dirname);
> -                       return -1;
> +       while (fscanf(f, "%llx %llx %llx", &start_addr,
> +                       &end_addr, &flags) == 3 && i < nb_maps) {
> +               if (flags & IORESOURCE_MEM) {
> +                       maps[i].offset = 0x0;
> +                       maps[i].size = end_addr - start_addr + 1;
> +                       maps[i].phaddr = start_addr;
> +                       i++;
>                 }
> +       }
> +       fclose(f);
>
> -               /* get mapping size */
> -               snprintf(filename, sizeof(filename),
> -                       "%s/size", dirname);
> -               if (pci_parse_sysfs_value(filename, &size) < 0) {
> -                       RTE_LOG(ERR, EAL,
> -                               "%s(): cannot parse size of %s\n",
> -                               __func__, dirname);
> -                       return -1;
> -               }
> +       return i;
> +}
>
> -               /* get mapping physical address */
> -               snprintf(filename, sizeof(filename),
> -                       "%s/addr", dirname);
> -               if (pci_parse_sysfs_value(filename, &maps[i].phaddr) < 0) {
> -                       RTE_LOG(ERR, EAL,
> -                               "%s(): cannot parse addr of %s\n",
> -                               __func__, dirname);
> -                       return -1;
> -               }
>

This function ends up reinventing the wheel from eal_pci.c plus it adds
some new array with mappings in them but not indexed the same way as
eal_pci.c see comments at the end of this mail.


> +static int
> +pci_uio_set_bus_master(int dev_fd)
> +{
> +       uint16_t reg;
> +       int ret;
>
> -               if ((offset > OFF_MAX) || (size > SIZE_MAX)) {
> -                       RTE_LOG(ERR, EAL,
> -                               "%s(): offset/size exceed system max
> value\n",
> -                               __func__);
> -                       return -1;
> -               }
> +       ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> +       if (ret != sizeof(reg)) {
> +               RTE_LOG(ERR, EAL,
> +                       "Cannot read command from PCI config space!\n");
> +               return -1;
> +       }
> +
> +       /* return if bus mastering is already on */
> +       if (reg & PCI_COMMAND_MASTER)
> +               return 0;
> +
> +       reg |= PCI_COMMAND_MASTER;
>
> -               maps[i].offset = offset;
> -               maps[i].size = size;
> +       ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> +       if (ret != sizeof(reg)) {
> +               RTE_LOG(ERR, EAL,
> +                       "Cannot write command to PCI config space!\n");
> +               return -1;
>         }
>
> -       return i;
> +       return 0;
>  }
>

It would have been the good time to have a generic api to access pci config
space.
Something like what Stephen proposed.


>  static int
> @@ -127,6 +130,10 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
>                         continue;
>
>                 for (i = 0; i != uio_res->nb_maps; i++) {
> +                       /* ignore mappings unused in primary process */
> +                       if (uio_res->maps[i].addr == NULL)
> +                               continue;
> +

                        /*
>                          * open devname, to mmap it
>                          */
> @@ -282,6 +289,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>  {
>         int i, j;
>         char dirname[PATH_MAX];
> +       char cfgname[PATH_MAX];
>         char devname[PATH_MAX]; /* contains the /dev/uioX */
>         void *mapaddr;
>         int uio_num;
> @@ -294,6 +302,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>         struct pci_map *maps;
>
>         dev->intr_handle.fd = -1;
> +       dev->intr_handle.uio_cfg_fd = -1;
>         dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>
>         /* secondary processes - use already recorded details */
> @@ -318,6 +327,28 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>         }
>         dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
>
> +       snprintf(cfgname, sizeof(cfgname),
> +                       "/sys/class/uio/uio%u/device/config", uio_num);
> +       dev->intr_handle.uio_cfg_fd = open(cfgname, O_RDWR);
> +       if (dev->intr_handle.uio_cfg_fd < 0) {
> +               RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +                       cfgname, strerror(errno));
> +               return -1;
> +       }
> +
> +       /* update devname for mmap  */
> +       snprintf(devname, sizeof(devname),
> +               SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
> +               loc->domain, loc->bus, loc->devid, loc->function, 0);
>

Why bother with a %d if you hardcode 0 ?
More importantly, since you hardcode "devname" to /sys/.../resource0, then
the mmap code will use a fd on this file.
I really am skeptical on the fact that it can work for devices that have no
bar0.

Then, how is this supposed to work ?

You rely on sysfs mmap code for pci resources.
Is this really equivalent to uio mmap operations ?



> +
> +       /* set bus master that is not done by uio_pci_generic */
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +               if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
> +                       RTE_LOG(ERR, EAL, "Cannot set up bus
> mastering!\n");
> +                       return -1;
> +               }
> +       }
> +
>

You are already running in a primary process, this check is useless.


>         /* allocate the mapping details for secondary processes*/
>         uio_res = rte_zmalloc("UIO_RES", sizeof(*uio_res), 0);
>         if (uio_res == NULL) {
> @@ -330,13 +361,12 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>         memcpy(&uio_res->pci_addr, &dev->addr, sizeof(uio_res->pci_addr));
>
>         /* collect info about device mappings */
> -       nb_maps = pci_uio_get_mappings(dirname, uio_res->maps,
> -                                      RTE_DIM(uio_res->maps));
> +       nb_maps = pci_uio_get_mappings(dev, uio_res->maps,
> +                                       RTE_DIM(uio_res->maps));
>         if (nb_maps < 0) {
>                 rte_free(uio_res);
>                 return nb_maps;
>         }
> -
>         uio_res->nb_maps = nb_maps;
>
>         /* Map all BARs */
>


Ok then after this, we use this temp array uio_res->maps and we loop all
over the pci resources.
Why do we have all these loops ?

I could see no point before, and with this change, it is bothering me again.
Won't it be easier to loop on the pci resources discovered by eal_pci.c
before this function is called ?

eal_pci.c is responsible for discovering pci devices, prepare those devices
(filling mem_resource[] for example), then eal_pci_uio.c / eal_pci_vfio.c
do the "mapping" stuff.
So uio / vfio must not overwrite what has already been set before.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-02-23 15:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-19 17:08 [dpdk-dev] [PATCH v2 0/3] Enable uio_pci_generic support Zhou Danny
2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 1/3] eal: enable " Zhou Danny
2015-02-20  9:01   ` Thomas Monjalon
2015-02-20 10:15     ` Bruce Richardson
2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 2/3] eal: add interrupt enable/disable routines for uio_pci_generic Zhou Danny
2015-02-19 17:08 ` [dpdk-dev] [PATCH v2 3/3] tools: enable binding NIC device to uio_pci_generic Zhou Danny
2015-02-20 16:59 ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Bruce Richardson
2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 1/3] eal: " Bruce Richardson
2015-02-23 15:24     ` David Marchand
2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 2/3] eal: add interrupt enable/disable routines for uio_pci_generic Bruce Richardson
2015-02-20 16:59   ` [dpdk-dev] [PATCH v3 3/3] tools: enable binding NIC device to uio_pci_generic Bruce Richardson
2015-02-20 17:44   ` [dpdk-dev] [PATCH v3 0/3] enable uio_pci_generic support Declan Doherty
2015-02-20 22:35     ` Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).