DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 0/5] receive IRQ related patches
@ 2015-05-18 17:40 Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 1/5] ethdev: check for rxq interrupt support Stephen Hemminger
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-18 17:40 UTC (permalink / raw)
  To: cumming.lian; +Cc: dev

These are some of the patches to enhance the still as not yet
merged receive interrupt functionality.

The big piece is support of UIO-MSI interrupts which is required
to make the virtio and vmxnet3 receive IRQ functionality work.
After this piece is reviewed, I will send those bits.

Stephen Hemminger (5):
  ethdev: check for rxq interrupt support
  ethdev: remove unnecessary checks
  ethdev: fix errors if RTE_ETHDEV_DEBUG enabled
  uio: new driver with MSI-X support
  uio: integrate MSI-X support

 config/common_linuxapp                             |   1 +
 lib/librte_eal/common/include/rte_pci.h            |   1 +
 lib/librte_eal/linuxapp/Makefile                   |   3 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |  94 +++++-
 lib/librte_eal/linuxapp/eal/eal_pci.c              |   4 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          |  59 +++-
 lib/librte_eal/linuxapp/eal/eal_uio_msi.h          |  26 ++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   1 +
 lib/librte_eal/linuxapp/uio_msi/Makefile           |  13 +
 lib/librte_eal/linuxapp/uio_msi/uio_msi.c          | 365 +++++++++++++++++++++
 lib/librte_eal/linuxapp/uio_msi/uio_msi.h          |  22 ++
 lib/librte_ether/rte_ethdev.c                      |  29 +-
 tools/dpdk_nic_bind.py                             |   2 +-
 13 files changed, 580 insertions(+), 40 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_uio_msi.h
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/Makefile
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.c
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.h

-- 
2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCH 1/5] ethdev: check for rxq interrupt support
  2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
@ 2015-05-18 17:40 ` Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 2/5] ethdev: remove unnecessary checks Stephen Hemminger
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-18 17:40 UTC (permalink / raw)
  To: cumming.lian; +Cc: dev

Not all devices support rxq interrupt yet.
It is better to check for interrupt support in driver at configuration
time than waiting for later failures.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/librte_ether/rte_ethdev.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index cb586ff..ad15837 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1183,6 +1183,14 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 	}
 
 	/*
+	 * If Receive Queue interrupt is enabled, check that
+	 * the device supports interrupt control.
+	 */
+	if (dev_conf->intr_conf.rxq == 1)
+		FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable,
+				    -EINVAL);
+
+	/*
 	 * If jumbo frames are enabled, check that the maximum RX packet
 	 * length is supported by the configured device.
 	 */
-- 
2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCH 2/5] ethdev: remove unnecessary checks
  2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 1/5] ethdev: check for rxq interrupt support Stephen Hemminger
@ 2015-05-18 17:40 ` Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 3/5] ethdev: fix errors if RTE_ETHDEV_DEBUG enabled Stephen Hemminger
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-18 17:40 UTC (permalink / raw)
  To: cumming.lian; +Cc: dev

Since the code has just called rte_eth_dev_is_valid_port
the following checks are unnecessary.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/librte_ether/rte_ethdev.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ad15837..7789338 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3305,10 +3305,6 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
 	}
 
 	dev = &rte_eth_devices[port_id];
-	if (dev == NULL) {
-		PMD_DEBUG_TRACE("Invalid port device\n");
-		return -ENODEV;
-	}
 
 	intr_handle = &dev->pci_dev->intr_handle;
 	if (!intr_handle->intr_vec) {
@@ -3350,10 +3346,6 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
 	}
 
 	dev = &rte_eth_devices[port_id];
-	if (dev == NULL) {
-		PMD_DEBUG_TRACE("Invalid port device\n");
-		return -ENODEV;
-	}
 
 	if (queue_id >= dev->data->nb_rx_queues) {
 		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id);
@@ -3391,10 +3383,6 @@ rte_eth_dev_rx_intr_enable(uint8_t port_id,
 	}
 
 	dev = &rte_eth_devices[port_id];
-	if (dev == NULL) {
-		PMD_DEBUG_TRACE("Invalid port device\n");
-		return -ENODEV;
-	}
 
 	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
 	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
@@ -3412,10 +3400,6 @@ rte_eth_dev_rx_intr_disable(uint8_t port_id,
 	}
 
 	dev = &rte_eth_devices[port_id];
-	if (dev == NULL) {
-		PMD_DEBUG_TRACE("Invalid port device\n");
-		return -ENODEV;
-	}
 
 	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
 	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCH 3/5] ethdev: fix errors if RTE_ETHDEV_DEBUG enabled
  2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 1/5] ethdev: check for rxq interrupt support Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 2/5] ethdev: remove unnecessary checks Stephen Hemminger
@ 2015-05-18 17:40 ` Stephen Hemminger
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support Stephen Hemminger
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-18 17:40 UTC (permalink / raw)
  To: cumming.lian; +Cc: dev

The interrupt mode patches introduced some obvious errors
if RTE_ETHDEV_DEBUG is defined.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/librte_ether/rte_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7789338..cf9a79a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3348,13 +3348,13 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
 	dev = &rte_eth_devices[port_id];
 
 	if (queue_id >= dev->data->nb_rx_queues) {
-		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id);
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
 		return -EINVAL;
 	}
 
 	intr_handle = &dev->pci_dev->intr_handle;
 	if (!intr_handle->intr_vec || intr_handle->intr_vec[queue_id] < 0) {
-		PMD_DEBUG_TRACE("RX Intr vector unset on %d\n", rx_queue_id);
+		PMD_DEBUG_TRACE("RX Intr vector unset on %d\n", queue_id);
 		return -EPERM;
 	}
 
-- 
2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support
  2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
                   ` (2 preceding siblings ...)
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 3/5] ethdev: fix errors if RTE_ETHDEV_DEBUG enabled Stephen Hemminger
@ 2015-05-18 17:40 ` Stephen Hemminger
  2015-05-25  6:01   ` Liang, Cunming
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 5/5] uio: integrate " Stephen Hemminger
  2015-07-09  0:28 ` [dpdk-dev] [PATCH 0/5] receive IRQ related patches Thomas Monjalon
  5 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-18 17:40 UTC (permalink / raw)
  To: cumming.lian; +Cc: dev

This is a merge of igb_uio with the MSI-X support through
eventfd (similar to VFIO). The driver requires a small change to
upstream UIO driver to allow UIO drivers to support ioctl's.

See:
http://marc.info/?l=linux-kernel&m=143197030217434&w=2
http://www.spinics.net/lists/kernel/msg1993359.html

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 config/common_linuxapp                    |   1 +
 lib/librte_eal/linuxapp/Makefile          |   3 +
 lib/librte_eal/linuxapp/uio_msi/Makefile  |  13 ++
 lib/librte_eal/linuxapp/uio_msi/uio_msi.c | 365 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/uio_msi/uio_msi.h |  22 ++
 5 files changed, 404 insertions(+)
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/Makefile
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.c
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 0078dc9..8299efe 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -100,6 +100,7 @@ CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
 CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
+CONFIG_RTE_EAL_UIO_MSI=y
 
 #
 # Special configurations in PCI Config Space for high performance
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index 8fcfdf6..d283952 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -34,6 +34,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 ifeq ($(CONFIG_RTE_EAL_IGB_UIO),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += igb_uio
 endif
+ifeq ($(CONFIG_RTE_EAL_UIO_MSI),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += uio_msi
+endif
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
diff --git a/lib/librte_eal/linuxapp/uio_msi/Makefile b/lib/librte_eal/linuxapp/uio_msi/Makefile
new file mode 100644
index 0000000..275174c
--- /dev/null
+++ b/lib/librte_eal/linuxapp/uio_msi/Makefile
@@ -0,0 +1,13 @@
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+MODULE = uio_msi
+MODULE_PATH = drivers/uio/uio_msi
+
+MODULE_CFLAGS += -I$(SRCDIR)
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -Winline -Wall -Werror
+
+SRCS-y := uio_msi.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/uio_msi/uio_msi.c b/lib/librte_eal/linuxapp/uio_msi/uio_msi.c
new file mode 100644
index 0000000..7b1dcea
--- /dev/null
+++ b/lib/librte_eal/linuxapp/uio_msi/uio_msi.c
@@ -0,0 +1,365 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright (c) 2015 by Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/eventfd.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/uio_driver.h>
+#include <linux/io.h>
+#include <linux/msi.h>
+#include <linux/version.h>
+
+#include "uio_msi.h"
+
+#define DRIVER_VERSION	"0.1.0"
+#define NON_Q_VECTORS	1
+
+/* MSI-X vector information */
+struct uio_msi_pci_dev {
+	struct uio_info info;		/* UIO driver info */
+	struct pci_dev *pdev;		/* PCI device */
+	struct mutex	mutex;		/* open/release/ioctl mutex */
+	int		ref_cnt;	/* references to device */
+	u16    		num_vectors;	/* How many MSI-X slots are used */
+	struct msix_entry *msix;	/* MSI-x vector table */
+	struct uio_msi_irq_ctx {
+		struct eventfd_ctx *trigger; /* MSI-x vector to eventfd */
+		char *name;		/* name in /proc/interrupts */
+	} *ctx;
+};
+
+static unsigned int max_vectors = 33;
+module_param(max_vectors, uint, 0);
+MODULE_PARM_DESC(max_vectors, "Upper limit on # of MSI-X vectors used");
+
+static irqreturn_t uio_msi_irqhandler(int irq, void *arg)
+{
+	struct eventfd_ctx *trigger = arg;
+
+	pr_devel("irq %u trigger %p\n", irq, trigger);
+
+	eventfd_signal(trigger, 1);
+	return IRQ_HANDLED;
+}
+
+/* set the mapping between vector # and existing eventfd. */
+static int set_irq_eventfd(struct uio_msi_pci_dev *udev, u32 vec, int fd)
+{
+	struct uio_msi_irq_ctx *ctx;
+	struct eventfd_ctx *trigger;
+	int irq, err;
+
+	if (vec >= udev->num_vectors) {
+		dev_notice(&udev->pdev->dev, "vec %u >= num_vec %u\n",
+			   vec, udev->num_vectors);
+		return -ERANGE;
+	}
+
+	irq = udev->msix[vec].vector;
+
+	/* Clearup existing irq mapping */
+	ctx = &udev->ctx[vec];
+	if (ctx->trigger) {
+		free_irq(irq, ctx->trigger);
+		eventfd_ctx_put(ctx->trigger);
+		ctx->trigger = NULL;
+	}
+
+	/* Passing -1 is used to disable interrupt */
+	if (fd < 0)
+		return 0;
+
+
+	trigger = eventfd_ctx_fdget(fd);
+	if (IS_ERR(trigger)) {
+		err = PTR_ERR(trigger);
+		dev_notice(&udev->pdev->dev,
+			   "eventfd ctx get failed: %d\n", err);
+		return err;
+	}
+
+	err = request_irq(irq, uio_msi_irqhandler, 0, ctx->name, trigger);
+	if (err) {
+		dev_notice(&udev->pdev->dev,
+			   "request irq failed: %d\n", err);
+		eventfd_ctx_put(trigger);
+		return err;
+	}
+
+	dev_dbg(&udev->pdev->dev, "map vector %u to fd %d trigger %p\n",
+		  vec, fd, trigger);
+	ctx->trigger = trigger;
+	return 0;
+}
+
+static int
+uio_msi_ioctl(struct uio_info *info, unsigned int cmd, unsigned long arg)
+{
+	struct uio_msi_pci_dev *udev
+		= container_of(info, struct uio_msi_pci_dev, info);
+	struct uio_msi_irq_set hdr;
+	int err;
+
+	switch (cmd) {
+	case UIO_MSI_IRQ_SET:
+		if (copy_from_user(&hdr, (void __user *)arg, sizeof(hdr)))
+			return -EFAULT;
+
+		mutex_lock(&udev->mutex);
+		err = set_irq_eventfd(udev, hdr.vec, hdr.fd);
+		mutex_unlock(&udev->mutex);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+	}
+	return err;
+}
+
+/* Opening the UIO device for first time enables MSI-X */
+static int
+uio_msi_open(struct uio_info *info, struct inode *inode)
+{
+	struct uio_msi_pci_dev *udev
+		= container_of(info, struct uio_msi_pci_dev, info);
+	int err = 0;
+
+	mutex_lock(&udev->mutex);
+	if (udev->ref_cnt++ == 0)
+		err = pci_enable_msix(udev->pdev, udev->msix,
+				      udev->num_vectors);
+	mutex_unlock(&udev->mutex);
+
+	return err;
+}
+
+/* Last close of the UIO device releases/disables all IRQ's */
+static int
+uio_msi_release(struct uio_info *info, struct inode *inode)
+{
+	struct uio_msi_pci_dev *udev
+		= container_of(info, struct uio_msi_pci_dev, info);
+
+	mutex_lock(&udev->mutex);
+	if (--udev->ref_cnt == 0) {
+		int i;
+
+		for (i = 0; i < udev->num_vectors; i++) {
+			struct uio_msi_irq_ctx *ctx = &udev->ctx[i];
+
+			if (!ctx->trigger)
+				continue;
+
+			free_irq(udev->msix[i].vector, ctx->trigger);
+			eventfd_ctx_put(ctx->trigger);
+			ctx->trigger = NULL;
+		}
+		pci_disable_msix(udev->pdev);
+	}
+	mutex_unlock(&udev->mutex);
+
+	return 0;
+}
+
+/* Unmap previously ioremap'd resources */
+static void
+release_iomaps(struct uio_mem *mem)
+{
+	int i;
+
+	for (i = 0; i < MAX_UIO_MAPS; i++, mem++) {
+		if (mem->internal_addr)
+			iounmap(mem->internal_addr);
+	}
+}
+
+static int
+setup_maps(struct pci_dev *pdev, struct uio_info *info)
+{
+	int i, m = 0, p = 0, err;
+	static const char * const bar_names[] = {
+		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
+	};
+
+	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
+		unsigned long start = pci_resource_start(pdev, i);
+		unsigned long flags = pci_resource_flags(pdev, i);
+		unsigned long len = pci_resource_len(pdev, i);
+
+		if (start == 0 || len == 0)
+			continue;
+
+		if (flags & IORESOURCE_MEM) {
+			void *addr;
+
+			if (m >= MAX_UIO_MAPS)
+				continue;
+
+			addr = ioremap(start, len);
+			if (addr == NULL) {
+				err = -EINVAL;
+				goto fail;
+			}
+
+			info->mem[m].name = bar_names[i];
+			info->mem[m].addr = start;
+			info->mem[m].internal_addr = addr;
+			info->mem[m].size = len;
+			info->mem[m].memtype = UIO_MEM_PHYS;
+			++m;
+		} else if (flags & IORESOURCE_IO) {
+			if (p >= MAX_UIO_PORT_REGIONS)
+				continue;
+
+			info->port[p].name = bar_names[i];
+			info->port[p].start = start;
+			info->port[p].size = len;
+			info->port[p].porttype = UIO_PORT_X86;
+			++p;
+		}
+	}
+
+	return 0;
+ fail:
+	for (i = 0; i < m; i++)
+		iounmap(info->mem[i].internal_addr);
+	return err;
+}
+
+static int uio_msi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct uio_msi_pci_dev *udev;
+	int i, err, vectors;
+
+	udev = kzalloc(sizeof(struct uio_msi_pci_dev), GFP_KERNEL);
+	if (!udev)
+		return -ENOMEM;
+
+	err = pci_enable_device(pdev);
+	if (err != 0) {
+		dev_err(&pdev->dev, "cannot enable PCI device\n");
+		goto fail_free;
+	}
+
+	vectors = pci_msix_vec_count(pdev);
+	if (vectors < 0) {
+		dev_err(&pdev->dev, "device does not support MSI-X\n");
+		err = -EINVAL;
+		goto fail_disable;
+	}
+
+	udev->num_vectors = min_t(u16, vectors, max_vectors);
+	udev->msix = kcalloc(GFP_KERNEL, sizeof(struct msix_entry),
+			     udev->num_vectors);
+	err = -ENOMEM;
+	if (!udev->msix)
+		goto fail_disable;
+
+	udev->ctx = kcalloc(GFP_KERNEL, sizeof(struct uio_msi_irq_ctx),
+			    udev->num_vectors);
+	if (!udev->ctx)
+		goto fail_free_msix;
+
+	for (i = 0; i < udev->num_vectors; i++) {
+		udev->msix[i].entry = i;
+
+		udev->ctx[i].name = kasprintf(GFP_KERNEL,
+					      KBUILD_MODNAME "[%d](%s)",
+					      i, pci_name(pdev));
+		if (!udev->ctx[i].name)
+			goto fail_free_ctx;
+	}
+
+	err = pci_request_regions(pdev, "uio_msi");
+	if (err != 0) {
+		dev_err(&pdev->dev, "Cannot request regions\n");
+		goto fail_free_ctx;
+	}
+
+	pci_set_master(pdev);
+
+	/* remap resources */
+	err = setup_maps(pdev, &udev->info);
+	if (err)
+		goto fail_release_iomem;
+
+	/* fill uio infos */
+	udev->info.name = "uio_msi";
+	udev->info.version = DRIVER_VERSION;
+	udev->info.priv = udev;
+	udev->pdev = pdev;
+	udev->info.ioctl = uio_msi_ioctl;
+	udev->info.open = uio_msi_open;
+	udev->info.release = uio_msi_release;
+	udev->info.irq = UIO_IRQ_CUSTOM;
+	mutex_init(&udev->mutex);
+
+	/* register uio driver */
+	err = uio_register_device(&pdev->dev, &udev->info);
+	if (err != 0)
+		goto fail_release_iomem;
+
+	pci_set_drvdata(pdev, udev);
+	return 0;
+
+fail_release_iomem:
+	release_iomaps(udev->info.mem);
+	pci_release_regions(pdev);
+fail_free_ctx:
+	for (i = 0; i < udev->num_vectors; i++)
+		kfree(udev->ctx[i].name);
+	kfree(udev->ctx);
+fail_free_msix:
+	kfree(udev->msix);
+fail_disable:
+	pci_disable_device(pdev);
+fail_free:
+	kfree(udev);
+
+	return err;
+}
+
+static void uio_msi_remove(struct pci_dev *pdev)
+{
+	struct uio_info *info = pci_get_drvdata(pdev);
+	struct uio_msi_pci_dev *udev
+		= container_of(info, struct uio_msi_pci_dev, info);
+	int i;
+
+	uio_unregister_device(info);
+	release_iomaps(info->mem);
+
+	pci_release_regions(pdev);
+	for (i = 0; i < udev->num_vectors; i++)
+		kfree(udev->ctx[i].name);
+	kfree(udev->ctx);
+	kfree(udev->msix);
+	pci_disable_device(pdev);
+
+	pci_set_drvdata(pdev, NULL);
+	kfree(info);
+}
+
+static struct pci_driver uio_msi_pci_driver = {
+	.name = "uio_msi",
+	.probe = uio_msi_probe,
+	.remove = uio_msi_remove,
+};
+
+module_pci_driver(uio_msi_pci_driver);
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Stephen Hemminger <stephen@networkplumber.org>");
+MODULE_DESCRIPTION("UIO driver for MSI-X PCI devices");
diff --git a/lib/librte_eal/linuxapp/uio_msi/uio_msi.h b/lib/librte_eal/linuxapp/uio_msi/uio_msi.h
new file mode 100644
index 0000000..297de00
--- /dev/null
+++ b/lib/librte_eal/linuxapp/uio_msi/uio_msi.h
@@ -0,0 +1,22 @@
+/*
+ * UIO_MSI API definition
+ *
+ * Copyright (c) 2015 by Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _UIO_PCI_MSI_H
+#define _UIO_PCI_MSI_H
+
+struct uio_msi_irq_set {
+	u32 vec;
+	int fd;
+};
+
+#define UIO_MSI_BASE	0x86
+#define UIO_MSI_IRQ_SET	_IOW('I', UIO_MSI_BASE+1, struct uio_msi_irq_set)
+
+#endif
-- 
2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [dpdk-dev] [PATCH 5/5] uio: integrate MSI-X support
  2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
                   ` (3 preceding siblings ...)
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support Stephen Hemminger
@ 2015-05-18 17:40 ` Stephen Hemminger
  2015-05-25  8:55   ` Liang, Cunming
  2015-07-09  0:28 ` [dpdk-dev] [PATCH 0/5] receive IRQ related patches Thomas Monjalon
  5 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-18 17:40 UTC (permalink / raw)
  To: cumming.lian; +Cc: dev

Add the new uio_msi as a supported driver model.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/librte_eal/common/include/rte_pci.h            |  1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 94 +++++++++++++++++++---
 lib/librte_eal/linuxapp/eal/eal_pci.c              |  4 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c          | 59 ++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_uio_msi.h          | 26 ++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  1 +
 lib/librte_ether/rte_ethdev.c                      |  1 +
 tools/dpdk_nic_bind.py                             |  2 +-
 8 files changed, 166 insertions(+), 22 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_uio_msi.h

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 223d3cd..106f4f7 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -147,6 +147,7 @@ enum rte_kernel_driver {
 	RTE_KDRV_IGB_UIO,
 	RTE_KDRV_VFIO,
 	RTE_KDRV_UIO_GENERIC,
+	RTE_KDRV_UIO_MSIX,
 };
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index fd97fc4..8cdab58 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -66,6 +66,7 @@
 
 #include "eal_private.h"
 #include "eal_vfio.h"
+#include "eal_uio_msi.h"
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
 
@@ -89,9 +90,7 @@ union intr_pipefds{
  */
 union rte_intr_read_buffer {
 	int uio_intr_count;              /* for uio device */
-#ifdef VFIO_PRESENT
-	uint64_t vfio_intr_count;        /* for vfio device */
-#endif
+	uint64_t eventfd_count;		 /* for vfio and uio-msi */
 	uint64_t timerfd_num;            /* for timerfd */
 	char charbuf[16];                /* for others */
 };
@@ -356,6 +355,67 @@ vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 }
 #endif
 
+/* enable MSI-X interrupts */
+static int
+uio_msix_enable(struct rte_intr_handle *intr_handle)
+{
+	int i, max_intr;
+
+	if (!intr_handle->max_intr ||
+	    intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+		max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+	else
+		max_intr = intr_handle->max_intr;
+
+	/* Actual number of MSI-X interrupts might be less than requested */
+	for (i = 0; i < max_intr; i++) {
+		struct uio_msi_irq_set irqs = {
+			.vec = i,
+			.fd = intr_handle->efds[i],
+		};
+
+		if (i == max_intr - 1)
+			irqs.fd = intr_handle->fd;
+
+		if (ioctl(intr_handle->vfio_dev_fd, UIO_MSI_IRQ_SET, &irqs) < 0) {
+			RTE_LOG(ERR, EAL,
+				"Error enabling MSI-X event %u fd %d (%s)\n",
+				irqs.vec, irqs.fd, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* disable MSI-X interrupts */
+static int
+uio_msix_disable(struct rte_intr_handle *intr_handle)
+{
+	int i, max_intr;
+
+	if (!intr_handle->max_intr ||
+	    intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+		max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+	else
+		max_intr = intr_handle->max_intr;
+
+	for (i = 0; i < max_intr; i++) {
+		struct uio_msi_irq_set irqs = {
+			.vec = i,
+			.fd = -1,
+		};
+
+		if (ioctl(intr_handle->vfio_dev_fd, UIO_MSI_IRQ_SET, &irqs) < 0) {
+			RTE_LOG(ERR, EAL,
+				"Error disabling MSI-X event %u (%s)\n",
+				i, strerror(errno));
+			return -1;
+		}
+	}
+	return 0;
+}
+
 static int
 uio_intx_intr_disable(struct rte_intr_handle *intr_handle)
 {
@@ -584,6 +644,10 @@ rte_intr_enable(struct rte_intr_handle *intr_handle)
 		if (uio_intx_intr_enable(intr_handle))
 			return -1;
 		break;
+	case RTE_INTR_HANDLE_UIO_MSIX:
+		if (uio_msix_enable(intr_handle))
+			return -1;
+		break;
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_ALARM:
 		return -1;
@@ -628,6 +692,10 @@ rte_intr_disable(struct rte_intr_handle *intr_handle)
 		if (uio_intx_intr_disable(intr_handle))
 			return -1;
 		break;
+	case RTE_INTR_HANDLE_UIO_MSIX:
+		if (uio_msix_disable(intr_handle))
+			return -1;
+		break;
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_ALARM:
 		return -1;
@@ -696,16 +764,19 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 		case RTE_INTR_HANDLE_UIO:
 			bytes_read = sizeof(buf.uio_intr_count);
 			break;
+
 		case RTE_INTR_HANDLE_ALARM:
 			bytes_read = sizeof(buf.timerfd_num);
 			break;
-#ifdef VFIO_PRESENT
+
+		case RTE_INTR_HANDLE_UIO_MSIX:
+#ifdef RTE_EAL_VFIO
 		case RTE_INTR_HANDLE_VFIO_MSIX:
 		case RTE_INTR_HANDLE_VFIO_MSI:
 		case RTE_INTR_HANDLE_VFIO_LEGACY:
-			bytes_read = sizeof(buf.vfio_intr_count);
-			break;
 #endif
+			bytes_read = sizeof(buf.eventfd_count);
+			break;
 		default:
 			bytes_read = 1;
 			break;
@@ -895,17 +966,14 @@ static void
 eal_intr_proc_rxtx_intr(int fd, struct rte_intr_handle *intr_handle)
 {
 	union rte_intr_read_buffer buf;
-	int bytes_read = 1;
+	int bytes_read = sizeof(buf.eventfd_count);
 
-	if (intr_handle->type != RTE_INTR_HANDLE_VFIO_MSIX) {
-		RTE_LOG(ERR, EAL, "intr type should be VFIO_MSIX\n");
+	if (intr_handle->type != RTE_INTR_HANDLE_VFIO_MSIX &&
+	    intr_handle->type != RTE_INTR_HANDLE_UIO_MSIX) {
+		RTE_LOG(ERR, EAL, "intr type should be VFIO_MSIX or UIO_MSIX\n");
 		return;
 	}
 
-#ifdef VFIO_PRESENT
-	bytes_read = sizeof(buf.vfio_intr_count);
-#endif
-
 	/**
 	 * read out to clear the ready-to-be-read flag
 	 * for epoll_wait.
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index d2adc66..814dc7c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -345,6 +345,8 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
 			dev->kdrv = RTE_KDRV_IGB_UIO;
 		else if (!strcmp(driver, "uio_pci_generic"))
 			dev->kdrv = RTE_KDRV_UIO_GENERIC;
+		else if (!strcmp(driver, "uio_msi"))
+			dev->kdrv = RTE_KDRV_UIO_MSIX;
 		else
 			dev->kdrv = RTE_KDRV_UNKNOWN;
 	} else if (ret < 0) {
@@ -576,6 +578,7 @@ pci_map_device(struct rte_pci_device *dev)
 			ret = pci_vfio_map_resource(dev);
 #endif
 		break;
+	case RTE_KDRV_UIO_MSIX:
 	case RTE_KDRV_IGB_UIO:
 	case RTE_KDRV_UIO_GENERIC:
 		/* map resources for devices that use uio */
@@ -603,6 +606,7 @@ pci_unmap_device(struct rte_pci_device *dev)
 	case RTE_KDRV_VFIO:
 		RTE_LOG(ERR, EAL, "Hotplug doesn't support vfio yet\n");
 		break;
+	case RTE_KDRV_UIO_MSIX:
 	case RTE_KDRV_IGB_UIO:
 	case RTE_KDRV_UIO_GENERIC:
 		/* unmap resources for devices that use uio */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index b5116a7..7eee828 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -38,6 +38,7 @@
 #include <sys/stat.h>
 #include <sys/mman.h>
 #include <linux/pci_regs.h>
+#include <sys/eventfd.h>
 
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -259,13 +260,42 @@ pci_get_uio_dev(struct rte_pci_device *dev, char *dstbuf,
 	return uio_num;
 }
 
+static int
+pci_uio_msix_init(struct rte_pci_device *dev)
+{
+	int i, fd;
+
+	/* set up an eventfd for interrupts */
+	fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+	if (fd < 0) {
+		RTE_LOG(ERR, EAL, "  cannot set up irq eventfd (%s)\n",
+			strerror(errno));
+		return -1;
+	}
+	dev->intr_handle.fd = fd;
+
+	/* an additional eventfd for each vector */
+	for (i = 0; i < RTE_MAX_RXTX_INTR_VEC_ID; i++) {
+		fd = eventfd(0, EFD_NONBLOCK|EFD_CLOEXEC);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL,
+				" cannot set up eventfd (%s)\n",
+				strerror(errno));
+			return -1;
+		}
+
+		dev->intr_handle.efds[i] = fd;
+	}
+
+	return 0;
+}
+
 /* map the PCI resource of a PCI device in virtual memory */
 int
 pci_uio_map_resource(struct rte_pci_device *dev)
 {
-	int i, map_idx;
+	int i, fd, map_idx;
 	char dirname[PATH_MAX];
-	char cfgname[PATH_MAX];
 	char devname[PATH_MAX]; /* contains the /dev/uioX */
 	void *mapaddr;
 	int uio_num;
@@ -274,11 +304,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	struct mapped_pci_resource *uio_res;
 	struct mapped_pci_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
 	struct pci_map *maps;
+	char cfgname[PATH_MAX];
 
 	dev->intr_handle.fd = -1;
-	dev->intr_handle.uio_cfg_fd = -1;
+	dev->intr_handle.vfio_dev_fd = -1;
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 
+	for (i = 0; i < RTE_MAX_RXTX_INTR_VEC_ID; i++)
+		dev->intr_handle.efds[i] = -1;
+
 	/* secondary processes - use already recorded details */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
 		return pci_uio_map_secondary(dev);
@@ -293,15 +327,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
 
 	/* save fd if in primary process */
-	dev->intr_handle.fd = open(devname, O_RDWR);
-	if (dev->intr_handle.fd < 0) {
+	fd = open(devname, O_RDWR);
+	if (fd < 0) {
 		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
 			devname, strerror(errno));
 		return -1;
 	}
 
 	snprintf(cfgname, sizeof(cfgname),
-			"/sys/class/uio/uio%u/device/config", uio_num);
+		 "/sys/class/uio/uio%u/device/config", uio_num);
 	dev->intr_handle.uio_cfg_fd = open(cfgname, O_RDWR);
 	if (dev->intr_handle.uio_cfg_fd < 0) {
 		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
@@ -309,9 +343,17 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 		return -1;
 	}
 
-	if (dev->kdrv == RTE_KDRV_IGB_UIO)
+	if (dev->kdrv == RTE_KDRV_UIO_MSIX) {
+		dev->intr_handle.vfio_dev_fd = fd;
+		dev->intr_handle.type = RTE_INTR_HANDLE_UIO_MSIX;
+		if (pci_uio_msix_init(dev) < 0)
+			return -1;
+	} else if (dev->kdrv == RTE_KDRV_IGB_UIO) {
+		dev->intr_handle.fd = fd;
 		dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
-	else {
+	} else {
+
+		dev->intr_handle.fd = fd;
 		dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
 
 		/* set bus master that is not done by uio_pci_generic */
@@ -460,6 +502,7 @@ pci_uio_unmap_resource(struct rte_pci_device *dev)
 
 	/* close fd if in primary process */
 	close(dev->intr_handle.fd);
+	close(dev->intr_handle.uio_cfg_fd);
 
 	dev->intr_handle.fd = -1;
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
diff --git a/lib/librte_eal/linuxapp/eal/eal_uio_msi.h b/lib/librte_eal/linuxapp/eal/eal_uio_msi.h
new file mode 100644
index 0000000..f01f302
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_uio_msi.h
@@ -0,0 +1,26 @@
+/*
+ * UIO_MSI API definition
+ *
+ * Copyright (c) 2015 by Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef EAL_UIO_MSI_H
+#define EAL_UIO_MSI_H
+
+/* Driver is not upstream yet. */
+
+#include <sys/ioctl.h>
+
+struct uio_msi_irq_set {
+	uint32_t vec;
+	int fd;
+};
+
+#define UIO_MSI_BASE	0x86
+#define UIO_MSI_IRQ_SET	_IOW('I', UIO_MSI_BASE+1, struct uio_msi_irq_set)
+
+#endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 9843001..d3cf680 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -44,6 +44,7 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,          /**< uio device handle */
 	RTE_INTR_HANDLE_UIO_INTX,     /**< uio generic handle */
+	RTE_INTR_HANDLE_UIO_MSIX,     /**< uio with MSI-X support */
 	RTE_INTR_HANDLE_VFIO_LEGACY,  /**< vfio device handle (legacy) */
 	RTE_INTR_HANDLE_VFIO_MSI,     /**< vfio device handle (MSI) */
 	RTE_INTR_HANDLE_VFIO_MSIX,    /**< vfio device handle (MSIX) */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index cf9a79a..3fbc4a1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -515,6 +515,7 @@ rte_eth_dev_is_detachable(uint8_t port_id)
 		switch (rte_eth_devices[port_id].pci_dev->kdrv) {
 		case RTE_KDRV_IGB_UIO:
 		case RTE_KDRV_UIO_GENERIC:
+		case RTE_KDRV_UIO_MSIX:
 			break;
 		case RTE_KDRV_VFIO:
 		default:
diff --git a/tools/dpdk_nic_bind.py b/tools/dpdk_nic_bind.py
index 8523f82..20b4b06 100755
--- a/tools/dpdk_nic_bind.py
+++ b/tools/dpdk_nic_bind.py
@@ -43,7 +43,7 @@ ETHERNET_CLASS = "0200"
 # Each device within this is itself a dictionary of device properties
 devices = {}
 # list of supported DPDK drivers
-dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic" ]
+dpdk_drivers = [ "igb_uio", "vfio-pci", "uio_pci_generic", "uio_msi" ]
 
 # command-line arg flags
 b_flag = None
-- 
2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support Stephen Hemminger
@ 2015-05-25  6:01   ` Liang, Cunming
  2015-05-25 17:41     ` Stephen Hemminger
  0 siblings, 1 reply; 11+ messages in thread
From: Liang, Cunming @ 2015-05-25  6:01 UTC (permalink / raw)
  To: Stephen Hemminger, cumming.lian; +Cc: dev



On 5/19/2015 1:40 AM, Stephen Hemminger wrote:
> +
> +/* set the mapping between vector # and existing eventfd. */
> +static int set_irq_eventfd(struct uio_msi_pci_dev *udev, u32 vec, int fd)
> +{
> +	struct uio_msi_irq_ctx *ctx;
> +	struct eventfd_ctx *trigger;
> +	int irq, err;
> +
> +	if (vec >= udev->num_vectors) {
> +		dev_notice(&udev->pdev->dev, "vec %u >= num_vec %u\n",
> +			   vec, udev->num_vectors);
> +		return -ERANGE;
> +	}
> +
> +	irq = udev->msix[vec].vector;
> +
> +	/* Clearup existing irq mapping */
> +	ctx = &udev->ctx[vec];
> +	if (ctx->trigger) {
> +		free_irq(irq, ctx->trigger);
> +		eventfd_ctx_put(ctx->trigger);
> +		ctx->trigger = NULL;
> +	}
> +
> +	/* Passing -1 is used to disable interrupt */
> +	if (fd < 0)
> +		return 0;
> +
> +
One unnecessary blank line here.
> +	trigger = eventfd_ctx_fdget(fd);
> +	if (IS_ERR(trigger)) {
> +		err = PTR_ERR(trigger);
> +		dev_notice(&udev->pdev->dev,
> +			   "eventfd ctx get failed: %d\n", err);
> +		return err;
> +	}
> +
> +	err = request_irq(irq, uio_msi_irqhandler, 0, ctx->name, trigger);
> +	if (err) {
> +		dev_notice(&udev->pdev->dev,
> +			   "request irq failed: %d\n", err);
> +		eventfd_ctx_put(trigger);
> +		return err;
> +	}
> +
> +	dev_dbg(&udev->pdev->dev, "map vector %u to fd %d trigger %p\n",
> +		  vec, fd, trigger);
> +	ctx->trigger = trigger;
> +	return 0;
> +}
> +
> +static int
> +uio_msi_ioctl(struct uio_info *info, unsigned int cmd, unsigned long arg)
> +{
> +	struct uio_msi_pci_dev *udev
> +		= container_of(info, struct uio_msi_pci_dev, info);
> +	struct uio_msi_irq_set hdr;
> +	int err;
> +
> +	switch (cmd) {
> +	case UIO_MSI_IRQ_SET:
> +		if (copy_from_user(&hdr, (void __user *)arg, sizeof(hdr)))
> +			return -EFAULT;
> +
> +		mutex_lock(&udev->mutex);
> +		err = set_irq_eventfd(udev, hdr.vec, hdr.fd);
> +		mutex_unlock(&udev->mutex);
> +		break;
> +	default:
> +		err = -EOPNOTSUPP;
> +	}
> +	return err;
> +}
"uio_msi_irq_set" defines in single pattern. Compare with the bulk set 
in "vfio_irq_set", it requires additional syscall during uio_msix_enable().
> +
> +static int uio_msi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> +	struct uio_msi_pci_dev *udev;
> +	int i, err, vectors;
> +
> +	udev = kzalloc(sizeof(struct uio_msi_pci_dev), GFP_KERNEL);
> +	if (!udev)
> +		return -ENOMEM;
> +
> +	err = pci_enable_device(pdev);
> +	if (err != 0) {
> +		dev_err(&pdev->dev, "cannot enable PCI device\n");
> +		goto fail_free;
> +	}
> +
> +	vectors = pci_msix_vec_count(pdev);
> +	if (vectors < 0) {
> +		dev_err(&pdev->dev, "device does not support MSI-X\n");
> +		err = -EINVAL;
> +		goto fail_disable;
> +	}
pci_msix_vec_count() is available since v3.14,  it requires a compatible 
check.
In order to support older version, probably a function 
'uio_msix_vec_count()' is necessary.

I've one overall question, is there a special reason not enhance igb_uio 
but define a new uio_msi?  And the looks like the piece could be add 
into igb_uio or uio_pci_generic as well. Do you have plan for the 
latter? Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCH 5/5] uio: integrate MSI-X support
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 5/5] uio: integrate " Stephen Hemminger
@ 2015-05-25  8:55   ` Liang, Cunming
  2015-05-25 17:42     ` Stephen Hemminger
  0 siblings, 1 reply; 11+ messages in thread
From: Liang, Cunming @ 2015-05-25  8:55 UTC (permalink / raw)
  To: Stephen Hemminger, cumming.lian; +Cc: dev



On 5/19/2015 1:40 AM, Stephen Hemminger wrote:
> +/* enable MSI-X interrupts */
> +static int
> +uio_msix_enable(struct rte_intr_handle *intr_handle)
> +{
> +	int i, max_intr;
> +
> +	if (!intr_handle->max_intr ||
> +	    intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
> +		max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
> +	else
> +		max_intr = intr_handle->max_intr;
> +
> +	/* Actual number of MSI-X interrupts might be less than requested */
> +	for (i = 0; i < max_intr; i++) {
> +		struct uio_msi_irq_set irqs = {
> +			.vec = i,
> +			.fd = intr_handle->efds[i],
> +		};
> +
> +		if (i == max_intr - 1)
> +			irqs.fd = intr_handle->fd;
> +
> +		if (ioctl(intr_handle->vfio_dev_fd, UIO_MSI_IRQ_SET, &irqs) < 0) {
It would be strange if using vfio_dev_fd in 'uio_msix_' related function.
> +			RTE_LOG(ERR, EAL,
> +				"Error enabling MSI-X event %u fd %d (%s)\n",
> +				irqs.vec, irqs.fd, strerror(errno));
> +			return -1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +
>   /* map the PCI resource of a PCI device in virtual memory */
>   int
>   pci_uio_map_resource(struct rte_pci_device *dev)
>   {
> -	int i, map_idx;
> +	int i, fd, map_idx;
>   	char dirname[PATH_MAX];
> -	char cfgname[PATH_MAX];
>   	char devname[PATH_MAX]; /* contains the /dev/uioX */
>   	void *mapaddr;
>   	int uio_num;
> @@ -274,11 +304,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>   	struct mapped_pci_resource *uio_res;
>   	struct mapped_pci_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list);
>   	struct pci_map *maps;
> +	char cfgname[PATH_MAX];
>   
>   	dev->intr_handle.fd = -1;
> -	dev->intr_handle.uio_cfg_fd = -1;
> +	dev->intr_handle.vfio_dev_fd = -1;
>   	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>   
> +	for (i = 0; i < RTE_MAX_RXTX_INTR_VEC_ID; i++)
> +		dev->intr_handle.efds[i] = -1;
> +
>   	/* secondary processes - use already recorded details */
>   	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>   		return pci_uio_map_secondary(dev);
> @@ -293,15 +327,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>   	snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
>   
>   	/* save fd if in primary process */
> -	dev->intr_handle.fd = open(devname, O_RDWR);
> -	if (dev->intr_handle.fd < 0) {
> +	fd = open(devname, O_RDWR);
> +	if (fd < 0) {
>   		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
>   			devname, strerror(errno));
>   		return -1;
>   	}
>   
>   	snprintf(cfgname, sizeof(cfgname),
> -			"/sys/class/uio/uio%u/device/config", uio_num);
> +		 "/sys/class/uio/uio%u/device/config", uio_num);
>   	dev->intr_handle.uio_cfg_fd = open(cfgname, O_RDWR);
>   	if (dev->intr_handle.uio_cfg_fd < 0) {
>   		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> @@ -309,9 +343,17 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>   		return -1;
>   	}
>   
> -	if (dev->kdrv == RTE_KDRV_IGB_UIO)
> +	if (dev->kdrv == RTE_KDRV_UIO_MSIX) {
> +		dev->intr_handle.vfio_dev_fd = fd;
I understand now uio_cfg_fd is used for uio configure and intx, so one 
additional event fd required by uio_msi for msi/msix other cause intr.
What about store it in epfd[max_intr - 1] or define a new 'uio_msi_efd' 
so as to avoid using vfio_dev_fd.
As uio_msi defines only to support msi/msix mode, while igb_uio support 
either intx or msix(one intr vector).
It makes confusing for user to choose which to insmod.
The ideal way probably be one uio kernel module support all mode, by new 
added ioctl it can configure which intr mode {intx, msi, msix}(by new 
eal option --uio-intr or merge with --vfio-intr to an unify --intr-mode) 
user app expect to use.

> +		dev->intr_handle.type = RTE_INTR_HANDLE_UIO_MSIX;
> +		if (pci_uio_msix_init(dev) < 0)
> +			return -1;
> +	} else if (dev->kdrv == RTE_KDRV_IGB_UIO) {
> +		dev->intr_handle.fd = fd;
>   		dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
> -	else {
> +	} else {
> +
> +		dev->intr_handle.fd = fd;
>   		dev->intr_handle.type = RTE_INTR_HANDLE_UIO_INTX;
>   
>   		/* set bus master that is not done by uio_pci_generic */
> @@ -460,6 +502,7 @@ pci_uio_unmap_resource(struct rte_pci_device *dev)
>   
>   	/* close fd if in primary process */
>   	close(dev->intr_handle.fd);
> +	close(dev->intr_handle.uio_cfg_fd);
>   
>   	dev->intr_handle.fd = -1;
>   	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support
  2015-05-25  6:01   ` Liang, Cunming
@ 2015-05-25 17:41     ` Stephen Hemminger
  0 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-25 17:41 UTC (permalink / raw)
  To: Liang, Cunming; +Cc: dev, cumming.lian

On Mon, 25 May 2015 14:01:14 +0800
"Liang, Cunming" <cunming.liang@intel.com> wrote:

> 
> 
> On 5/19/2015 1:40 AM, Stephen Hemminger wrote:
> > +
> > +/* set the mapping between vector # and existing eventfd. */
> > +static int set_irq_eventfd(struct uio_msi_pci_dev *udev, u32 vec, int fd)
> > +{
> > +	struct uio_msi_irq_ctx *ctx;
> > +	struct eventfd_ctx *trigger;
> > +	int irq, err;
> > +
> > +	if (vec >= udev->num_vectors) {
> > +		dev_notice(&udev->pdev->dev, "vec %u >= num_vec %u\n",
> > +			   vec, udev->num_vectors);
> > +		return -ERANGE;
> > +	}
> > +
> > +	irq = udev->msix[vec].vector;
> > +
> > +	/* Clearup existing irq mapping */
> > +	ctx = &udev->ctx[vec];
> > +	if (ctx->trigger) {
> > +		free_irq(irq, ctx->trigger);
> > +		eventfd_ctx_put(ctx->trigger);
> > +		ctx->trigger = NULL;
> > +	}
> > +
> > +	/* Passing -1 is used to disable interrupt */
> > +	if (fd < 0)
> > +		return 0;
> > +
> > +
> One unnecessary blank line here.
> > +	trigger = eventfd_ctx_fdget(fd);
> > +	if (IS_ERR(trigger)) {
> > +		err = PTR_ERR(trigger);
> > +		dev_notice(&udev->pdev->dev,
> > +			   "eventfd ctx get failed: %d\n", err);
> > +		return err;
> > +	}
> > +
> > +	err = request_irq(irq, uio_msi_irqhandler, 0, ctx->name, trigger);
> > +	if (err) {
> > +		dev_notice(&udev->pdev->dev,
> > +			   "request irq failed: %d\n", err);
> > +		eventfd_ctx_put(trigger);
> > +		return err;
> > +	}
> > +
> > +	dev_dbg(&udev->pdev->dev, "map vector %u to fd %d trigger %p\n",
> > +		  vec, fd, trigger);
> > +	ctx->trigger = trigger;
> > +	return 0;
> > +}
> > +
> > +static int
> > +uio_msi_ioctl(struct uio_info *info, unsigned int cmd, unsigned long arg)
> > +{
> > +	struct uio_msi_pci_dev *udev
> > +		= container_of(info, struct uio_msi_pci_dev, info);
> > +	struct uio_msi_irq_set hdr;
> > +	int err;
> > +
> > +	switch (cmd) {
> > +	case UIO_MSI_IRQ_SET:
> > +		if (copy_from_user(&hdr, (void __user *)arg, sizeof(hdr)))
> > +			return -EFAULT;
> > +
> > +		mutex_lock(&udev->mutex);
> > +		err = set_irq_eventfd(udev, hdr.vec, hdr.fd);
> > +		mutex_unlock(&udev->mutex);
> > +		break;
> > +	default:
> > +		err = -EOPNOTSUPP;
> > +	}
> > +	return err;
> > +}
> "uio_msi_irq_set" defines in single pattern. Compare with the bulk set 
> in "vfio_irq_set", it requires additional syscall during uio_msix_enable().

The bulk operation is VFIO is actually a bad design,
It forces too many updates when manipulating individual vectors.
Personally, the whole VFIO API has some questionable design choices
that I did not want to repeat.

> > +static int uio_msi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > +{
> > +	struct uio_msi_pci_dev *udev;
> > +	int i, err, vectors;
> > +
> > +	udev = kzalloc(sizeof(struct uio_msi_pci_dev), GFP_KERNEL);
> > +	if (!udev)
> > +		return -ENOMEM;
> > +
> > +	err = pci_enable_device(pdev);
> > +	if (err != 0) {
> > +		dev_err(&pdev->dev, "cannot enable PCI device\n");
> > +		goto fail_free;
> > +	}
> > +
> > +	vectors = pci_msix_vec_count(pdev);
> > +	if (vectors < 0) {
> > +		dev_err(&pdev->dev, "device does not support MSI-X\n");
> > +		err = -EINVAL;
> > +		goto fail_disable;
> > +	}
> pci_msix_vec_count() is available since v3.14,  it requires a compatible 
> check.
> In order to support older version, probably a function 
> 'uio_msix_vec_count()' is necessary.
> 
> I've one overall question, is there a special reason not enhance igb_uio 
> but define a new uio_msi?  And the looks like the piece could be add 
> into igb_uio or uio_pci_generic as well. Do you have plan for the 
> latter? Thanks.

I wanted something that could go upstream. igb_uio has some other things
which make it unlikely to get accepted in current form, so seemed best to start
fresh.

Also, intentionally did not want to address any kernel version earlier
than the version where VFIO was added.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCH 5/5] uio: integrate MSI-X support
  2015-05-25  8:55   ` Liang, Cunming
@ 2015-05-25 17:42     ` Stephen Hemminger
  0 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2015-05-25 17:42 UTC (permalink / raw)
  To: Liang, Cunming; +Cc: dev, cumming.lian

On Mon, 25 May 2015 16:55:23 +0800
"Liang, Cunming" <cunming.liang@intel.com> wrote:

> > +			irqs.fd = intr_handle->fd;
> > +
> > +		if (ioctl(intr_handle->vfio_dev_fd, UIO_MSI_IRQ_SET, &irqs) < 0) {  
> It would be strange if using vfio_dev_fd in 'uio_msix_' related function.

Just minor variable name. The whole set of fd's etc, probably need better names..

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [dpdk-dev] [PATCH 0/5] receive IRQ related patches
  2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
                   ` (4 preceding siblings ...)
  2015-05-18 17:40 ` [dpdk-dev] [PATCH 5/5] uio: integrate " Stephen Hemminger
@ 2015-07-09  0:28 ` Thomas Monjalon
  5 siblings, 0 replies; 11+ messages in thread
From: Thomas Monjalon @ 2015-07-09  0:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

2015-05-18 10:40, Stephen Hemminger:
> These are some of the patches to enhance the still as not yet
> merged receive interrupt functionality.

Have you made these comments in the interrupt mode thread?

> The big piece is support of UIO-MSI interrupts which is required
> to make the virtio and vmxnet3 receive IRQ functionality work.

It seems better to wait that the kernel module is integrated upstream.
Any news?

Then the last patch to use this new module will have to be carefully reviewed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-07-09  0:29 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-18 17:40 [dpdk-dev] [PATCH 0/5] receive IRQ related patches Stephen Hemminger
2015-05-18 17:40 ` [dpdk-dev] [PATCH 1/5] ethdev: check for rxq interrupt support Stephen Hemminger
2015-05-18 17:40 ` [dpdk-dev] [PATCH 2/5] ethdev: remove unnecessary checks Stephen Hemminger
2015-05-18 17:40 ` [dpdk-dev] [PATCH 3/5] ethdev: fix errors if RTE_ETHDEV_DEBUG enabled Stephen Hemminger
2015-05-18 17:40 ` [dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support Stephen Hemminger
2015-05-25  6:01   ` Liang, Cunming
2015-05-25 17:41     ` Stephen Hemminger
2015-05-18 17:40 ` [dpdk-dev] [PATCH 5/5] uio: integrate " Stephen Hemminger
2015-05-25  8:55   ` Liang, Cunming
2015-05-25 17:42     ` Stephen Hemminger
2015-07-09  0:28 ` [dpdk-dev] [PATCH 0/5] receive IRQ related patches Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).