DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 0/7] Vyatta patches
@ 2013-05-30 17:12 Stephen Hemminger
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576 Stephen Hemminger
                   ` (9 more replies)
  0 siblings, 10 replies; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

This is a subset of the Vyatta patches we use. They include some bug
fixes and simple changes to make life easier.

One fix not included is the NUMA cpu assignment. The original Intel code
in 1.2 incorrectly used /proc/cpuinfo to try and assign CPU's to NUMA socket.
The problem is that /proc/cpuinfo physical_id corresponds to what the BIOS
tells the kernel and is intended for messages only. For example, on our
Dell boxes the first CPU and only CPU is reported as physical_id 1!
The fix is to use sysfs instead, and Intel did incorporate my fix in the
next DPDK version, and don't want to confuse this code base by putting
in a conflicting change.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-06-05 14:22   ` Vincent JARDIN
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case Stephen Hemminger
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: igb-workaround-wthresh-82576.patch --]
[-- Type: text/plain, Size: 1285 bytes --]

The 82576 has known issues which require the write threshold to be set to 1.
See:
	http://download.intel.com/design/network/specupdt/82576_SPECUPDATE.pdf

If not then single packets will hang in transmit ring until more arrive.
Simple tests like ping will fail.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 lib/librte_pmd_e1000/em_rxtx.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/lib/librte_pmd_igb/e1000_rxtx.c	2013-03-28 08:50:50.238413818 -0700
+++ b/lib/librte_pmd_igb/e1000_rxtx.c	2013-05-29 08:49:02.369979711 -0700
@@ -1240,6 +1240,8 @@ eth_igb_tx_queue_setup(struct rte_eth_de
 	txq->pthresh = tx_conf->tx_thresh.pthresh;
 	txq->hthresh = tx_conf->tx_thresh.hthresh;
 	txq->wthresh = tx_conf->tx_thresh.wthresh;
+	if (txq->wthresh > 0 && hw->mac.type == e1000_82576)
+		txq->wthresh = 1;
 	txq->queue_id = queue_idx;
 	txq->port_id = dev->data->port_id;
 
@@ -1384,6 +1386,9 @@ eth_igb_rx_queue_setup(struct rte_eth_de
 	rxq->pthresh = rx_conf->rx_thresh.pthresh;
 	rxq->hthresh = rx_conf->rx_thresh.hthresh;
 	rxq->wthresh = rx_conf->rx_thresh.wthresh;
+	if (rxq->wthresh > 0 && hw->mac.type == e1000_82576)
+		rxq->wthresh = 1;
+
 	rxq->rx_free_thresh = rx_conf->rx_free_thresh;
 	rxq->queue_id = queue_idx;
 	rxq->port_id = dev->data->port_id;

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576 Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-06-05 14:25   ` Vincent JARDIN
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 3/7] optimize log/panic Stephen Hemminger
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: timer-optimize.patch --]
[-- Type: text/plain, Size: 933 bytes --]

In many application there are no timers queued, and the call to
rte_timer_managecan be optimized in that case avoid reading HPET and
lock overhead.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 lib/librte_timer/rte_timer.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/lib/librte_timer/rte_timer.c	2013-03-28 08:50:50.246413714 -0700
+++ b/lib/librte_timer/rte_timer.c	2013-05-29 08:49:09.305899903 -0700
@@ -404,9 +404,14 @@ void rte_timer_manage(void)
 	union rte_timer_status status;
 	struct rte_timer *tim, *tim2;
 	unsigned lcore_id = rte_lcore_id();
-	uint64_t cur_time = rte_get_hpet_cycles();
+	uint64_t cur_time;
 	int ret;
 
+	/* optimize for the case where per-cpu list is empty */
+	if (LIST_EMPTY(&priv_timer[lcore_id].pending))
+		return;
+
+	cur_time = rte_get_hpet_cycles();
 	__TIMER_STAT_ADD(manage, 1);
 
 	/* browse ordered list, add expired timers in 'expired' list */

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 3/7] optimize log/panic
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576 Stephen Hemminger
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-06-05 14:34   ` Vincent JARDIN
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 4/7] eal: support different modules Stephen Hemminger
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: rte_panic-cold.patch --]
[-- Type: text/plain, Size: 1166 bytes --]

Both logging and calls to panic are never in the critical path.
Use the GCC attribute cold to mark these functions as cold,
which generates more optimised code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 lib/librte_eal/common/include/rte_debug.h |    1 +
 lib/librte_eal/common/include/rte_log.h   |    1 +
 2 files changed, 2 insertions(+)

--- a/lib/librte_eal/common/include/rte_debug.h	2013-05-29 08:45:38.888471864 -0700
+++ b/lib/librte_eal/common/include/rte_debug.h	2013-05-29 08:49:12.361864834 -0700
@@ -87,6 +87,7 @@ void rte_dump_registers(void);
  * documentation.
  */
 void __rte_panic(const char *funcname , const char *format, ...)
+	__attribute__((cold))
 	__attribute__((noreturn))
 	__attribute__((format(printf, 2, 3)));
 
--- a/lib/librte_eal/common/include/rte_log.h	2013-03-28 08:50:50.234413869 -0700
+++ b/lib/librte_eal/common/include/rte_log.h	2013-05-29 08:49:12.361864834 -0700
@@ -216,6 +216,7 @@ int rte_log_add_in_history(const char *b
  *   - Negative on error.
  */
 int rte_log(uint32_t level, uint32_t logtype, const char *format, ...)
+	__attribute__((cold))
 	__attribute__((format(printf, 3, 4)));
 
 /**

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (2 preceding siblings ...)
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 3/7] optimize log/panic Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-06-03  8:58   ` Damien Millescamps
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device Stephen Hemminger
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: module-name.patch --]
[-- Type: text/plain, Size: 4840 bytes --]

An additional change is that failure to detect module should not be fatal,
but an error like other EAL setup problems, allowing the application
to recover.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 app/chkincs/test_pci.c                  |    3 ++-
 app/test/test_pci.c                     |    4 ++--
 lib/librte_eal/common/include/rte_pci.h |    5 ++---
 lib/librte_eal/linuxapp/eal/eal_pci.c   |    9 ++-------
 lib/librte_pmd_igb/e1000_ethdev.c       |    2 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c     |    4 ++--
 6 files changed, 11 insertions(+), 16 deletions(-)

--- a/lib/librte_eal/common/include/rte_pci.h	2013-05-29 08:45:38.000000000 -0700
+++ b/lib/librte_eal/common/include/rte_pci.h	2013-05-29 09:02:50.000000000 -0700
@@ -151,12 +151,11 @@ struct rte_pci_driver {
 	pci_devinit_t *devinit;                 /**< Device init. function. */
 	struct rte_pci_id *id_table;            /**< ID table, NULL terminated. */
 	uint32_t drv_flags;                     /**< Flags contolling handling of device. */
+	const char *module_name;		/**< Associated kernel module */
 };
 
-/** Device needs igb_uio kernel module */
-#define RTE_PCI_DRV_NEED_IGB_UIO 0x0001
 /** Device driver must be registered several times until failure */
-#define RTE_PCI_DRV_MULTIPLE 0x0002
+#define RTE_PCI_DRV_MULTIPLE 0x0001
 
 /**
  * Probe the PCI bus for registered drivers.
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-03-28 08:50:50.000000000 -0700
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-05-29 09:04:00.000000000 -0700
@@ -76,7 +76,7 @@
  * This code is used to simulate a PCI probe by parsing information in
  * sysfs. Moreover, when a registered driver matches a device, the
  * kernel driver currently using it is unloaded and replaced by
- * igb_uio module, which is a very minimal userland driver for Intel
+ * a module, which is a very minimal userland driver for Intel
  * network card, only providing access to PCI BAR to applications, and
  * enabling bus master.
  */
@@ -84,8 +84,6 @@
 
 #define PROC_MODULES "/proc/modules"
 
-#define IGB_UIO_NAME "igb_uio"
-
 #define UIO_NEWID "/sys/bus/pci/drivers/%s/new_id"
 #define UIO_BIND  "/sys/bus/pci/drivers/%s/bind"
 
@@ -700,12 +698,9 @@ int
 rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *dev)
 {
 	struct rte_pci_id *id_table;
-	const char *module_name = NULL;
+	const char *module_name = dr->module_name;
 	int uio_status = -1;
 
-	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO)
-		module_name = IGB_UIO_NAME;
-
 	uio_status = pci_uio_check_module(module_name);
 
 	for (id_table = dr->id_table ; id_table->vendor_id != 0; id_table++) {
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 08:45:38.000000000 -0700
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 16:08:07.544000027 -0700
@@ -547,7 +547,7 @@ static struct eth_driver rte_ixgbe_pmd =
 	{
 		.name = "rte_ixgbe_pmd",
 		.id_table = pci_id_ixgbe_map,
-		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
+		.module_name = "igb_uio",
 	},
 	.eth_dev_init = eth_ixgbe_dev_init,
 	.dev_private_size = sizeof(struct ixgbe_adapter),
@@ -560,7 +560,7 @@ static struct eth_driver rte_ixgbevf_pmd
 	{
 		.name = "rte_ixgbevf_pmd",
 		.id_table = pci_id_ixgbevf_map,
-		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
+		.module_name = "igb_uio",
 	},
 	.eth_dev_init = eth_ixgbevf_dev_init,
 	.dev_private_size = sizeof(struct ixgbe_adapter),
--- a/app/test/test_pci.c	2013-03-28 08:50:50.000000000 -0700
+++ b/app/test/test_pci.c	2013-05-29 16:05:22.881670915 -0700
@@ -69,7 +69,7 @@ static int my_driver_init(struct rte_pci
 			  struct rte_pci_device *dev);
 
 /*
- * To test cases where RTE_PCI_DRV_NEED_IGB_UIO is set, and isn't set, two
+ * To test cases where module_name is set, and isn't set, two
  * drivers are created (one with IGB devices, the other with IXGBE devices).
  */
 
@@ -101,7 +101,7 @@ struct rte_pci_driver my_driver = {
 	.name = "test_driver",
 	.devinit = my_driver_init,
 	.id_table = my_driver_id,
-	.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
+	.module_name = "igb_uio",
 };
 
 struct rte_pci_driver my_driver2 = {
--- a/app/chkincs/test_pci.c	2013-03-28 08:50:50.210414181 -0700
+++ b/app/chkincs/test_pci.c	2013-05-29 16:07:43.880238599 -0700
@@ -64,7 +64,8 @@ struct rte_pci_driver my_driver = {
 	"test_driver",
 	my_driver_init,
 	my_driver_id,
-	RTE_PCI_DRV_NEED_IGB_UIO,
+	0,
+	"igb_uio",
 };
 
 static int
--- a/lib/librte_pmd_igb/e1000_ethdev.c	2013-05-29 08:45:38.888471864 -0700
+++ b/lib/librte_pmd_igb/e1000_ethdev.c	2013-05-29 16:04:50.410003570 -0700
@@ -336,7 +336,7 @@ static struct eth_driver rte_igb_pmd = {
 	{
 		.name = "rte_igb_pmd",
 		.id_table = pci_id_igb_map,
-		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
+		.module_name = "igb_uio",
 	},
 	.eth_dev_init = eth_igb_dev_init,
 	.dev_private_size = sizeof(struct e1000_adapter),

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (3 preceding siblings ...)
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 4/7] eal: support different modules Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-06-03 16:41   ` Thomas Monjalon
  2013-06-05 14:50   ` Damien Millescamps
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 6/7] igb_uio: pci_block_user_cfg_access is unsafe, remove it Stephen Hemminger
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: eal-pci-multiple.patch --]
[-- Type: text/plain, Size: 14007 bytes --]

---
 app/test-pmd/config.c                   |    2 
 app/test-pmd/testpmd.h                  |    8 -
 lib/librte_eal/common/eal_common_pci.c  |   11 +
 lib/librte_eal/common/include/rte_pci.h |    7 
 lib/librte_eal/linuxapp/eal/eal_pci.c   |  244 +++++++++++++++++++-------------
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c     |    4 
 6 files changed, 169 insertions(+), 107 deletions(-)

--- a/lib/librte_eal/common/eal_common_pci.c	2013-05-29 08:45:38.000000000 -0700
+++ b/lib/librte_eal/common/eal_common_pci.c	2013-05-29 16:09:36.351108977 -0700
@@ -121,12 +121,19 @@ rte_eal_pci_probe(void)
 static int
 pci_dump_one_device(struct rte_pci_device *dev)
 {
+	int i;
+
 	printf(PCI_PRI_FMT, dev->addr.domain, dev->addr.bus,
 	       dev->addr.devid, dev->addr.function);
 	printf(" - vendor:%x device:%x\n", dev->id.vendor_id,
 	       dev->id.device_id);
-	printf("   %16.16"PRIx64" %16.16"PRIx64"\n",
-	       dev->mem_resource.phys_addr, dev->mem_resource.len);
+
+	for (i = 0; i < PCI_MEM_RESOURCE; i++) {
+		printf("   %16.16"PRIx64" %16.16"PRIx64"\n",
+		       dev->mem_resource[i].phys_addr,
+		       dev->mem_resource[i].len);
+	}
+
 	return 0;
 }
 
--- a/lib/librte_eal/common/include/rte_pci.h	2013-05-29 09:02:50.000000000 -0700
+++ b/lib/librte_eal/common/include/rte_pci.h	2013-05-29 16:30:28.456968882 -0700
@@ -50,6 +50,7 @@ extern "C" {
 #include <sys/queue.h>
 #include <stdint.h>
 #include <inttypes.h>
+#include <limits.h>
 #include <rte_interrupts.h>
 
 TAILQ_HEAD(pci_device_list, rte_pci_device); /**< PCI devices in D-linked Q. */
@@ -81,6 +82,8 @@ struct rte_pci_resource {
 
 /** Maximum number of PCI resources. */
 #define PCI_MAX_RESOURCE 7
+/** Maximum number of PCI memory resources. */
+#define PCI_MEM_RESOURCE 5
 
 /**
  * A structure describing an ID for a PCI driver. Each driver provides a
@@ -110,10 +113,12 @@ struct rte_pci_device {
 	TAILQ_ENTRY(rte_pci_device) next;       /**< Next probed PCI device. */
 	struct rte_pci_addr addr;               /**< PCI location. */
 	struct rte_pci_id id;                   /**< PCI ID. */
-	struct rte_pci_resource mem_resource;   /**< PCI Memory Resource */
+	struct rte_pci_resource mem_resource[PCI_MEM_RESOURCE];
+						/**< PCI Memory Resource */
 	struct rte_intr_handle intr_handle;     /**< Interrupt handle */
 	const struct rte_pci_driver *driver;    /**< Associated driver */
 	unsigned int blacklisted:1;             /**< Device is blacklisted */
+	char uio_name[PATH_MAX];		/**< Associated UIO device name */
 };
 
 /** Any PCI device identifier (vendor, device, ...) */
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-05-29 09:04:00.000000000 -0700
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-05-29 16:37:39.600742359 -0700
@@ -99,7 +99,6 @@ struct uio_resource {
 
 	struct rte_pci_addr pci_addr;
 	void *addr;
-	char path[PATH_MAX];
 	unsigned long size;
 	unsigned long offset;
 };
@@ -212,64 +211,108 @@ pci_uio_bind_device(struct rte_pci_devic
 	return 0;
 }
 
-/* map a particular resource from a file */
-static void *
-pci_map_resource(struct rte_pci_device *dev, void *requested_addr, const char *devname,
-		unsigned long offset, unsigned long size)
+/*
+ * open devname: it can take some time to
+ * appear, so we wait some time before returning an error
+ */
+static int uio_open(const char *devname)
 {
-	unsigned n;
-	int fd;
-	void *mapaddr;
+	int n, fd;
 
-	/*
-	 * open devname, and mmap it: it can take some time to
-	 * appear, so we wait some time before returning an error
-	 */
-	for (n=0; n<UIO_DEV_WAIT_TIMEOUT*10; n++) {
+	for (n=0; n < UIO_DEV_WAIT_TIMEOUT*10; n++) {
 		fd = open(devname, O_RDWR);
 		if (fd >= 0)
-			break;
+			return fd;
+
 		if (errno != ENOENT)
 			break;
 		usleep(100000);
 	}
-	if (fd < 0) {
-		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", devname, strerror(errno));
-		goto fail;
-	}
+	return -1;
+}
+
+/* map a particular resource from a file */
+static void *
+pci_mmap(int fd, void *addr, off_t offset, size_t size)
+{
+	void *mapaddr;
 
 	/* Map the PCI memory resource of device */
-	mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE,
-			MAP_SHARED, fd, offset);
-	if (mapaddr == MAP_FAILED ||
-			(requested_addr != NULL && mapaddr != requested_addr)) {
-		RTE_LOG(ERR, EAL, "%s(): cannot mmap %s: %s\n", __func__,
-			devname, strerror(errno));
-		close(fd);
-		goto fail;
-	}
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		/* save fd if in primary process */
-		dev->intr_handle.fd = fd;
-		dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
-	} else {
-		/* fd is not needed in slave process, close it */
-		dev->intr_handle.fd = -1;
-		dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
-		close(fd);
+	mapaddr = mmap(addr, size, PROT_READ | PROT_WRITE, MAP_SHARED,
+		       fd, offset);
+	if (mapaddr == MAP_FAILED || (addr != NULL && mapaddr != addr)) {
+		RTE_LOG(ERR, EAL, "%s(): cannot mmap %zd@0x%lx: %s\n",
+			__func__, size, offset, strerror(errno));
+		return NULL;
 	}
 
 	RTE_LOG(DEBUG, EAL, "PCI memory mapped at %p\n", mapaddr);
-
 	return mapaddr;
+}
+
+/* save the mapping details for secondary processes*/
+static int pci_uio_map_save(const struct rte_pci_device *dev, void *mapaddr,
+			    unsigned long offset, unsigned long size)
+{
+	struct uio_resource *uio_res;
+
+	uio_res = rte_malloc("UIO_RES", sizeof(*uio_res), 0);
+	if (uio_res == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): cannot store uio mmap details\n",
+			__func__);
+		return -1;
+	}
+
+	uio_res->addr = mapaddr;
+	uio_res->offset = offset;
+	uio_res->size = size;
+	memcpy(&uio_res->pci_addr, &dev->addr, sizeof(uio_res->pci_addr));
+
+	TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
+	return 0;
+}
+
+static int pci_uio_map_restore(struct rte_pci_device *dev)
+{
+	struct uio_resource *uio_res;
+	int i, fd;
+	void *addr;
 
-fail:
 	dev->intr_handle.fd = -1;
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 
-	return NULL;
+	fd = uio_open(dev->uio_name);
+	if (fd < 0)
+		return -1;
+
+	TAILQ_FOREACH(uio_res, uio_res_list, next) {
+		/* skip this element if it doesn't match our PCI address */
+		if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr)))
+			continue;
+
+		for (i = 0; i < PCI_MEM_RESOURCE; i++) {
+			if (dev->mem_resource[i].len == 0)
+				continue;
+
+			addr = pci_mmap(fd, uio_res->addr,
+					uio_res->offset, uio_res->size);
+			if (addr != uio_res->addr) {
+				RTE_LOG(ERR, EAL, "Cannot mmap device resource\n");
+				close(fd);
+				return -1;
+			}
+		}
+
+		close(fd);
+		return 0;
+	}
+
+	RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
+	close(fd);
+	return -1;
 }
-/* map the PCI resource of a PCI device in virtual memory */
+
+/* map the PCI resources of a PCI device in virtual memory */
 static int
 pci_uio_map_resource(struct rte_pci_device *dev)
 {
@@ -278,35 +321,20 @@ pci_uio_map_resource(struct rte_pci_devi
 	char dirname[PATH_MAX];
 	char dirname2[PATH_MAX];
 	char filename[PATH_MAX];
-	char devname[PATH_MAX]; /* contains the /dev/uioX */
+	int i, fd;
 	void *mapaddr;
 	unsigned uio_num;
-	unsigned long size, offset;
+	unsigned long size, offset, page_size;
 	struct rte_pci_addr *loc = &dev->addr;
-	struct uio_resource *uio_res;
+
+	page_size = sysconf(_SC_PAGE_SIZE);
 
 	RTE_LOG(DEBUG, EAL, "map PCI resource for device "PCI_PRI_FMT"\n",
 	        loc->domain, loc->bus, loc->devid, loc->function);
 
 	/* secondary processes - use already recorded details */
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
-
-			TAILQ_FOREACH(uio_res, uio_res_list, next) {
-				/* skip this element if it doesn't match our PCI address */
-				if (memcmp(&uio_res->pci_addr, &dev->addr, sizeof(dev->addr)))
-					continue;
-
-				if (pci_map_resource(dev, uio_res->addr, uio_res->path, \
-						uio_res->offset, uio_res->size) == uio_res->addr)
-					return 0;
-				else {
-					RTE_LOG(ERR, EAL, "Cannot mmap device resource\n");
-					return -1;
-				}
-			}
-			RTE_LOG(ERR, EAL, "Cannot find resource for device\n");
-			return -1;
-	}
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return pci_uio_map_restore(dev);
 
 	/* depending on kernel version, uio can be located in uio/uioX
 	 * or uio:uioX */
@@ -362,44 +390,59 @@ pci_uio_map_resource(struct rte_pci_devi
 	if (e == NULL)
 		return 0;
 
-	/* get mapping offset */
-	rte_snprintf(filename, sizeof(filename),
-		 "%s/maps/map0/offset", dirname2);
-	if (pci_parse_sysfs_value(filename, &offset) < 0) {
-		RTE_LOG(ERR, EAL, "%s(): cannot parse offset\n",
-			__func__);
-		return -1;
-	}
+	/* open /dev/uioX */
+	rte_snprintf(dev->uio_name, sizeof(dev->uio_name),
+		     "/dev/uio%u", uio_num);
 
-	/* get mapping size */
-	rte_snprintf(filename, sizeof(filename),
-		 "%s/maps/map0/size", dirname2);
-	if (pci_parse_sysfs_value(filename, &size) < 0) {
-		RTE_LOG(ERR, EAL, "%s(): cannot parse size\n",
-			__func__);
+	fd = uio_open(dev->uio_name);
+	if (fd < 0) {
+		RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
+			dev->uio_name, strerror(errno));
 		return -1;
 	}
 
-	/* open and mmap /dev/uioX */
-	rte_snprintf(devname, sizeof(devname), "/dev/uio%u", uio_num);
-	mapaddr = pci_map_resource(dev, NULL, devname, offset, size);
-	if (mapaddr == NULL)
-		return -1;
-	dev->mem_resource.addr = mapaddr;
+	/* map associated memory resources. */
+	for (i = 0; i < PCI_MEM_RESOURCE; i++) {
+		if (dev->mem_resource[i].len == 0)
+			continue;
 
-	/* save the mapping details for secondary processes*/
-	uio_res = rte_malloc("UIO_RES", sizeof(*uio_res), 0);
-	if (uio_res == NULL){
-		RTE_LOG(ERR, EAL, "%s(): cannot store uio mmap details\n", __func__);
-		return -1;
+		rte_snprintf(filename, sizeof(filename),
+			     "%s/maps/map%d/offset", dirname2, i);
+
+		if (access(filename, F_OK) < 0)
+			continue; /* this resource is not mapped via uio */
+
+		/* get mapping offset */
+		if (pci_parse_sysfs_value(filename, &offset) < 0) {
+			RTE_LOG(ERR, EAL, "%s(): cannot parse offset\n",
+				__func__);
+			return -1;
+		}
+
+		/* page number indicates which resource */
+		offset += i * page_size;
+
+		/* get mapping size */
+		rte_snprintf(filename, sizeof(filename),
+			     "%s/maps/map%d/size", dirname2, i);
+		if (pci_parse_sysfs_value(filename, &size) < 0) {
+			RTE_LOG(ERR, EAL, "%s(): cannot parse size\n",
+				__func__);
+			return -1;
+		}
+
+		mapaddr = pci_mmap(fd, NULL, offset, size);
+		if (mapaddr == NULL)
+			return -1;
+
+		dev->mem_resource[i].addr = mapaddr;
+		if (pci_uio_map_save(dev, mapaddr, offset, size) < 0)
+			return -1;
 	}
-	uio_res->addr = mapaddr;
-	uio_res->offset = offset;
-	uio_res->size = size;
-	rte_snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname);
-	memcpy(&uio_res->pci_addr, &dev->addr, sizeof(uio_res->pci_addr));
 
-	TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);
+	/* save fd if in primary process */
+	dev->intr_handle.fd = fd;
+	dev->intr_handle.type = RTE_INTR_HANDLE_UIO;
 
 	return 0;
 }
@@ -420,7 +463,7 @@ pci_parse_sysfs_resource(const char *fil
 		};
 		char *ptrs[PCI_RESOURCE_FMT_NVAL];
 	} res_info;
-	int i;
+	int i, m;
 	uint64_t phys_addr, end_addr, flags;
 
 	f = fopen(filename, "r");
@@ -429,6 +472,7 @@ pci_parse_sysfs_resource(const char *fil
 		return -1;
 	}
 
+	m = 0;
 	for (i = 0; i<PCI_MAX_RESOURCE; i++) {
 
 		if (fgets(buf, sizeof(buf), f) == NULL) {
@@ -450,10 +494,16 @@ pci_parse_sysfs_resource(const char *fil
 		}
 
 		if (flags & IORESOURCE_MEM) {
-			dev->mem_resource.phys_addr = phys_addr;
-			dev->mem_resource.len = end_addr - phys_addr + 1;
-			dev->mem_resource.addr = NULL; /* not mapped for now */
-			break;
+			if (m == PCI_MEM_RESOURCE) {
+				RTE_LOG(ERR, EAL, "%s(): too many memory resources\n",
+					__func__);
+				goto error;
+			}
+
+			dev->mem_resource[m].phys_addr = phys_addr;
+			dev->mem_resource[m].len = end_addr - phys_addr + 1;
+			dev->mem_resource[m].addr = NULL; /* not mapped for now */
+			++m;
 		}
 	}
 	fclose(f);
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 16:08:07.544000027 -0700
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 16:09:36.355108938 -0700
@@ -369,7 +369,7 @@ eth_ixgbe_dev_init(__attribute__((unused
 	/* Vendor and Device ID need to be set before init of shared code */
 	hw->device_id = pci_dev->id.device_id;
 	hw->vendor_id = pci_dev->id.vendor_id;
-	hw->hw_addr = (void *)pci_dev->mem_resource.addr;
+	hw->hw_addr = pci_dev->mem_resource[0].addr;
 
 	/* Initialize the shared code */
 	diag = ixgbe_init_shared_code(hw);
@@ -490,7 +490,7 @@ eth_ixgbevf_dev_init(__attribute__((unus
 
 	hw->device_id = pci_dev->id.device_id;
 	hw->vendor_id = pci_dev->id.vendor_id;
-	hw->hw_addr = (void *)pci_dev->mem_resource.addr;
+	hw->hw_addr = pci_dev->mem_resource[0].addr;
 
 	/* Initialize the shared code */
 	diag = ixgbe_init_shared_code(hw);
--- a/app/test-pmd/config.c	2013-03-28 08:50:50.000000000 -0700
+++ b/app/test-pmd/config.c	2013-05-29 16:09:36.359108898 -0700
@@ -180,7 +180,7 @@ port_reg_off_is_invalid(portid_t port_id
 		       (unsigned)reg_off);
 		return 1;
 	}
-	pci_len = ports[port_id].dev_info.pci_dev->mem_resource.len;
+	pci_len = ports[port_id].dev_info.pci_dev->mem_resource[0].len;
 	if (reg_off >= pci_len) {
 		printf("Port %d: register offset %u (0x%X) out of port PCI "
 		       "resource (length=%"PRIu64")\n",
--- a/app/test-pmd/testpmd.h	2013-03-28 08:50:50.000000000 -0700
+++ b/app/test-pmd/testpmd.h	2013-05-29 16:17:33.550417158 -0700
@@ -304,8 +304,8 @@ port_pci_reg_read(struct rte_port *port,
 	void *reg_addr;
 	uint32_t reg_v;
 
-	reg_addr = (void *)((char *)port->dev_info.pci_dev->mem_resource.addr +
-			    reg_off);
+	reg_addr = (char *)port->dev_info.pci_dev->mem_resource[0].addr
+		+ reg_off;
 	reg_v = *((volatile uint32_t *)reg_addr);
 	return rte_le_to_cpu_32(reg_v);
 }
@@ -318,8 +318,8 @@ port_pci_reg_write(struct rte_port *port
 {
 	void *reg_addr;
 
-	reg_addr = (void *)((char *)port->dev_info.pci_dev->mem_resource.addr +
-			    reg_off);
+	reg_addr = (char *)port->dev_info.pci_dev->mem_resource[0].addr
+		    + reg_off;
 	*((volatile uint32_t *)reg_addr) = rte_cpu_to_le_32(reg_v);
 }
 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 6/7] igb_uio: pci_block_user_cfg_access is unsafe, remove it
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (4 preceding siblings ...)
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters Stephen Hemminger
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: igb-remove-pci-lock.patch --]
[-- Type: text/plain, Size: 1954 bytes --]

Using pci_block_user_cfg_access in IRQ context is unsafe. In fact if kernel
is compiled with option to check for might_sleep() it causes a warning
and a backtrace. The problem Intel was trying to solve here can not be
solved with these functions; better to just remove them.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c	2013-03-28 08:50:50.234413869 -0700
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c	2013-05-29 16:41:30.974514849 -0700
@@ -28,15 +28,6 @@
 #include <linux/msi.h>
 #include <linux/version.h>
 
-/* Some function names changes between 3.2.0 and 3.3.0... */
-#if LINUX_VERSION_CODE < KERNEL_VERSION(3,3,0)
-#define PCI_LOCK pci_block_user_cfg_access
-#define PCI_UNLOCK pci_unblock_user_cfg_access
-#else
-#define PCI_LOCK pci_cfg_access_lock
-#define PCI_UNLOCK pci_cfg_access_unlock
-#endif
-
 /**
  * MSI-X related macros, copy from linux/pci_regs.h in kernel 2.6.39,
  * but none of them in kernel 2.6.35.
@@ -170,14 +161,9 @@ igbuio_pci_irqcontrol(struct uio_info *i
 {
 	unsigned long flags;
 	struct rte_uio_pci_dev *udev = igbuio_get_uio_pci_dev(info);
-	struct pci_dev *pdev = udev->pdev;
 
 	spin_lock_irqsave(&udev->lock, flags);
-	PCI_LOCK(pdev);
-
 	igbuio_set_interrupt_mask(udev, irq_state);
-
-	PCI_UNLOCK(pdev);
 	spin_unlock_irqrestore(&udev->lock, flags);
 
 	return 0;
@@ -198,8 +184,6 @@ igbuio_pci_irqhandler(int irq, struct ui
 	uint16_t status;
 
 	spin_lock_irqsave(&udev->lock, flags);
-	/* block userspace PCI config reads/writes */
-	PCI_LOCK(pdev);
 
 	/* for legacy mode, interrupt maybe shared */
 	if (udev->mode == IGBUIO_LEGACY_INTR_MODE) {
@@ -214,7 +198,6 @@ igbuio_pci_irqhandler(int irq, struct ui
 	ret = IRQ_HANDLED;
 done:
 	/* unblock userspace PCI config reads/writes */
-	PCI_UNLOCK(pdev);
 	spin_unlock_irqrestore(&udev->lock, flags);
 	printk(KERN_INFO "irq 0x%x %s\n", irq, (ret == IRQ_HANDLED) ? "handled" : "not handled");
 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (5 preceding siblings ...)
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 6/7] igb_uio: pci_block_user_cfg_access is unsafe, remove it Stephen Hemminger
@ 2013-05-30 17:12 ` Stephen Hemminger
  2013-06-05 14:36   ` Vincent JARDIN
  2013-05-30 22:20 ` [dpdk-dev] [PATCH 0/7] Vyatta patches Thomas Monjalon
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-30 17:12 UTC (permalink / raw)
  To: dev

[-- Attachment #1: syslog.patch --]
[-- Type: text/plain, Size: 5916 bytes --]

By default, DPDK based applications would only allow logging
to syslog as "rte", DAEMON; but for any production application more
control is desired to allow using actual application name and
overriding the facility.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 lib/librte_eal/common/include/eal_private.h            |    2 
 lib/librte_eal/linuxapp/eal/eal.c                      |   58 ++++++++++++++++-
 lib/librte_eal/linuxapp/eal/eal_log.c                  |    4 -
 lib/librte_eal/linuxapp/eal/include/eal_internal_cfg.h |    1 
 4 files changed, 61 insertions(+), 4 deletions(-)

--- a/lib/librte_eal/common/include/eal_private.h	2013-05-29 17:03:20.167850142 -0700
+++ b/lib/librte_eal/common/include/eal_private.h	2013-05-29 17:05:15.066310021 -0700
@@ -115,7 +115,7 @@ int rte_eal_log_early_init(void);
  * @return
  *   0 on success, negative on error
  */
-int rte_eal_log_init(void);
+int rte_eal_log_init(const char *id, int facility);
 
 /**
  * Init the default log stream
--- a/lib/librte_eal/linuxapp/eal/eal.c	2013-05-29 17:03:20.167850142 -0700
+++ b/lib/librte_eal/linuxapp/eal/eal.c	2013-05-29 17:12:09.317086933 -0700
@@ -40,6 +40,7 @@
 #include <stdarg.h>
 #include <unistd.h>
 #include <pthread.h>
+#include <syslog.h>
 #include <getopt.h>
 #include <fcntl.h>
 #include <stddef.h>
@@ -82,6 +83,7 @@
 #define OPT_NO_PCI      "no-pci"
 #define OPT_NO_HUGE     "no-huge"
 #define OPT_FILE_PREFIX "file-prefix"
+#define OPT_SYSLOG      "syslog"
 
 #define RTE_EAL_BLACKLIST_SIZE	0x100
 
@@ -265,6 +267,7 @@ eal_usage(const char *prgname)
 	       "               (multiple -b options are alowed)\n"
 	       "  -m MB      : memory to allocate (default = size of hugemem)\n"
 	       "  -r NUM     : force number of memory ranks (don't detect)\n"
+	       "  --"OPT_SYSLOG"   : set syslog facility\n"
 	       "  --"OPT_HUGE_DIR" : directory where hugetlbfs is mounted\n"
 	       "  --"OPT_PROC_TYPE": type of this process\n"
 	       "  --"OPT_FILE_PREFIX": prefix for hugepage filenames\n"
@@ -369,6 +372,45 @@ eal_parse_blacklist_opt(const char *opta
 	return (idx);
 }
 
+static int
+eal_parse_syslog(const char *facility)
+{
+	int i;
+	static struct {
+		const char *name;
+		int value;
+	} map[] = {
+		{ "auth", LOG_AUTH },
+		{ "cron", LOG_CRON },
+		{ "daemon", LOG_DAEMON },
+		{ "ftp", LOG_FTP },
+		{ "kern", LOG_KERN },
+		{ "lpr", LOG_LPR },
+		{ "mail", LOG_MAIL },
+		{ "news", LOG_NEWS },
+		{ "syslog", LOG_SYSLOG },
+		{ "user", LOG_USER },
+		{ "uucp", LOG_UUCP },
+		{ "local0", LOG_LOCAL0 },
+		{ "local1", LOG_LOCAL1 },
+		{ "local2", LOG_LOCAL2 },
+		{ "local3", LOG_LOCAL3 },
+		{ "local4", LOG_LOCAL4 },
+		{ "local5", LOG_LOCAL5 },
+		{ "local6", LOG_LOCAL6 },
+		{ "local7", LOG_LOCAL7 },
+		{ NULL, 0 }
+	};
+
+	for (i = 0; map[i].name; i++) {
+		if (!strcmp(facility, map[i].name)) {
+			internal_config.syslog_facility = map[i].value;
+			return 0;
+		}
+	}
+	return -1;
+}
+
 
 /* Parse the argument given in the command line of the application */
 static int
@@ -389,6 +431,7 @@ eal_parse_args(int argc, char **argv)
 		{OPT_NO_SHCONF, 0, 0, 0},
 		{OPT_PROC_TYPE, 1, 0, 0},
 		{OPT_FILE_PREFIX, 1, 0, 0},
+		{OPT_SYSLOG, 1, 0, 0},
 		{0, 0, 0, 0}
 	};
 
@@ -399,6 +442,7 @@ eal_parse_args(int argc, char **argv)
 	internal_config.force_nchannel = 0;
 	internal_config.hugefile_prefix = HUGEFILE_PREFIX_DEFAULT;
 	internal_config.hugepage_dir = NULL;
+	internal_config.syslog_facility = LOG_DAEMON;
 #ifdef RTE_LIBEAL_USE_HPET
 	internal_config.no_hpet = 0;
 #else
@@ -487,6 +531,14 @@ eal_parse_args(int argc, char **argv)
 			else if (!strcmp(lgopts[option_index].name, OPT_FILE_PREFIX)) {
 				internal_config.hugefile_prefix = optarg;
 			}
+			else if (!strcmp(lgopts[option_index].name, OPT_SYSLOG)) {
+				if (eal_parse_syslog(optarg) < 0) {
+					RTE_LOG(ERR, EAL, "invalid parameters for --"
+							OPT_SYSLOG "\n");
+					eal_usage(prgname);
+					return -1;
+				}
+			}
 			break;
 
 		default:
@@ -538,6 +590,10 @@ rte_eal_init(int argc, char **argv)
 {
 	int i, fctret, ret;
 	pthread_t thread_id;
+	const char *logid;
+
+	logid = strrchr(argv[0], '/');
+	logid = strdup(logid ? logid + 1: argv[0]);
 
 	thread_id = pthread_self();
 
@@ -585,7 +641,7 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_tailqs_init() < 0)
 		rte_panic("Cannot init tail queues for objects\n");
 
-	if (rte_eal_log_init() < 0)
+	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
 		rte_panic("Cannot init logs\n");
 
 	if (rte_eal_alarm_init() < 0)
--- a/lib/librte_eal/linuxapp/eal/eal_log.c	2013-05-29 17:03:20.167850142 -0700
+++ b/lib/librte_eal/linuxapp/eal/eal_log.c	2013-05-29 17:05:15.066310021 -0700
@@ -119,7 +119,7 @@ static cookie_io_functions_t console_log
  * once memzones are available.
  */
 int
-rte_eal_log_init(void)
+rte_eal_log_init(const char *id, int facility)
 {
 	FILE *log_stream;
 
@@ -127,7 +127,7 @@ rte_eal_log_init(void)
 	if (log_stream == NULL)
 		return -1;
 
-	openlog("rte", LOG_NDELAY | LOG_PID, LOG_DAEMON);
+	openlog(id, LOG_NDELAY | LOG_PID, facility);
 
 	if (rte_eal_common_log_init(log_stream) < 0)
 		return -1;
--- a/lib/librte_eal/linuxapp/eal/include/eal_internal_cfg.h	2013-05-29 17:03:20.167850142 -0700
+++ b/lib/librte_eal/linuxapp/eal/include/eal_internal_cfg.h	2013-05-29 17:05:15.066310021 -0700
@@ -65,6 +65,7 @@ struct internal_config {
 	volatile unsigned no_hpet;        /* true to disable HPET */
 	volatile unsigned vmware_tsc_map; /* true to use VMware TSC mapping instead of native TSC */
 	volatile unsigned no_shconf;      /* true if there is no shared config */
+	volatile int syslog_facility;	  /* facility passed to openlog() */
 	volatile enum rte_proc_type_t process_type; /* multi-process proc type */
 	const char *hugefile_prefix;      /* the base filename of hugetlbfs files */
 	const char *hugepage_dir;         /* specific hugetlbfs directory to use */

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 0/7] Vyatta patches
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (6 preceding siblings ...)
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters Stephen Hemminger
@ 2013-05-30 22:20 ` Thomas Monjalon
  2013-05-31  9:29 ` Damien Millescamps
  2013-06-03 15:22 ` Thomas Monjalon
  9 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-05-30 22:20 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Hello,

30/05/2013 10:12, Stephen Hemminger :
> This is a subset of the Vyatta patches we use. They include some bug
> fixes and simple changes to make life easier.

Thank you

> One fix not included is the NUMA cpu assignment. The original Intel code
> in 1.2 incorrectly used /proc/cpuinfo to try and assign CPU's to NUMA
> socket. The problem is that /proc/cpuinfo physical_id corresponds to what
> the BIOS tells the kernel and is intended for messages only. For example,
> on our Dell boxes the first CPU and only CPU is reported as physical_id 1!
> The fix is to use sysfs instead, and Intel did incorporate my fix in the
> next DPDK version, and don't want to confuse this code base by putting in a
> conflicting change.

If I understand well, Intel could release this patch among other ones and they 
don't want to handle this patch separately.
Maybe we could apply it now but if it's not urgent and if Intel plan to 
release it soon, we may wait for the Intel drop.

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 0/7] Vyatta patches
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (7 preceding siblings ...)
  2013-05-30 22:20 ` [dpdk-dev] [PATCH 0/7] Vyatta patches Thomas Monjalon
@ 2013-05-31  9:29 ` Damien Millescamps
  2013-05-31 15:45   ` Stephen Hemminger
  2013-06-03 15:22 ` Thomas Monjalon
  9 siblings, 1 reply; 39+ messages in thread
From: Damien Millescamps @ 2013-05-31  9:29 UTC (permalink / raw)
  To: dev

On 05/30/2013 07:12 PM, Stephen Hemminger wrote:
> One fix not included is the NUMA cpu assignment. The original Intel code
> in 1.2 incorrectly used /proc/cpuinfo to try and assign CPU's to NUMA socket.
> The problem is that /proc/cpuinfo physical_id corresponds to what the BIOS
> tells the kernel and is intended for messages only. For example, on our
> Dell boxes the first CPU and only CPU is reported as physical_id 1!
> The fix is to use sysfs instead,
Hi Stephen,

Are you using the
/sys/devices/system/node/nodeX/cpuX/topology/physical_package_id special
file ?

If so, then it is only usable starting from Kernel 3.3 according to this
fix:
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=64be4c1c2428e148de6081af235e2418e6a66dda

The value returned for kernel prior to 3.3 is really not better than the
one from /proc/cpuinfo, so that's kind of a robbing Peter to pay Paul
example...

There is obviously a problem with the NUMA node ID detection right now
since both /proc/cpuinfo and /sys can return incorrect values, however
the node and cpu numbering is always good in the kernel boot log from
what I know. So there might be a better way to find the real node ID
whatever the kernel version used.

Cheers,
-- 
Damien Millescamps

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 0/7] Vyatta patches
  2013-05-31  9:29 ` Damien Millescamps
@ 2013-05-31 15:45   ` Stephen Hemminger
  2013-05-31 16:44     ` Damien Millescamps
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-31 15:45 UTC (permalink / raw)
  To: Damien Millescamps; +Cc: dev

On Fri, 31 May 2013 11:29:09 +0200
Damien Millescamps <damien.millescamps@6wind.com> wrote:

> On 05/30/2013 07:12 PM, Stephen Hemminger wrote:
> > One fix not included is the NUMA cpu assignment. The original Intel code
> > in 1.2 incorrectly used /proc/cpuinfo to try and assign CPU's to NUMA socket.
> > The problem is that /proc/cpuinfo physical_id corresponds to what the BIOS
> > tells the kernel and is intended for messages only. For example, on our
> > Dell boxes the first CPU and only CPU is reported as physical_id 1!
> > The fix is to use sysfs instead,
> Hi Stephen,
> 
> Are you using the
> /sys/devices/system/node/nodeX/cpuX/topology/physical_package_id special
> file ?
> 
> If so, then it is only usable starting from Kernel 3.3 according to this
> fix:
> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=64be4c1c2428e148de6081af235e2418e6a66dda
> 
> The value returned for kernel prior to 3.3 is really not better than the
> one from /proc/cpuinfo, so that's kind of a robbing Peter to pay Paul
> example...
> 
> There is obviously a problem with the NUMA node ID detection right now
> since both /proc/cpuinfo and /sys can return incorrect values, however
> the node and cpu numbering is always good in the kernel boot log from
> what I know. So there might be a better way to find the real node ID
> whatever the kernel version used.
> 
> Cheers,

You need to use /sys/devices/system/cpu/cpuN/topology/physical_package_id
as documented in kernel (Documentation/cputopology.txt).
This was confirmed by several kernel developers including Andi Kleen from
Intel. The value in /proc/cpuinfo comes from the APCI tables and is the value
reported by the BIOS. There are machines that report socket 1 and 2.
Haven't played with older kernels, but the topology information in sysfs
has existed since 2.6.16.

The final Intel solution was to use physical_packate_id and fall back
to /proc/cpuinfo as last resort.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 0/7] Vyatta patches
  2013-05-31 15:45   ` Stephen Hemminger
@ 2013-05-31 16:44     ` Damien Millescamps
  2013-05-31 17:00       ` Stephen Hemminger
  0 siblings, 1 reply; 39+ messages in thread
From: Damien Millescamps @ 2013-05-31 16:44 UTC (permalink / raw)
  To: dev

On 05/31/2013 05:45 PM, Stephen Hemminger wrote:
> This was confirmed by several kernel developers including Andi Kleen from
> Intel. The value in /proc/cpuinfo comes from the APCI tables and is the value
> reported by the BIOS. There are machines that report socket 1 and 2.
> Haven't played with older kernels, but the topology information in sysfs
> has existed since 2.6.16.
Hi Stephen,

You didn't really address my point, because the Linux documentation says
something doesn't mean it works in the real world.

If you take the example of the hugepages, they have been in the Linux
kernel since 2.6.28, but the last fix was pushed in 2.6.37 and before
2.6.33 it is just impossible to do something with hugepages.
If you take a look at section 2.3.2 of the DPDK getting started:
http://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-dpdk-getting-started-guide.pdf,
you will find that this issue is at least partially addressed.

So the point is that the topology information being theoretically
correct on all kernel Linux supported by the DPDK, but the information
being practically wrong on versions prior to kernel 3.3 means that the
"fix" is breaking the support for a quite huge range of kernel versions...
So maybe there might be a better way to get this node information on all
kernel still currently supported by the DPDK.

Cheers,
-- 
Damien Millescamps

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 0/7] Vyatta patches
  2013-05-31 16:44     ` Damien Millescamps
@ 2013-05-31 17:00       ` Stephen Hemminger
  0 siblings, 0 replies; 39+ messages in thread
From: Stephen Hemminger @ 2013-05-31 17:00 UTC (permalink / raw)
  To: Damien Millescamps; +Cc: dev

On Fri, 31 May 2013 18:44:03 +0200
Damien Millescamps <damien.millescamps@6wind.com> wrote:

> On 05/31/2013 05:45 PM, Stephen Hemminger wrote:
> > This was confirmed by several kernel developers including Andi Kleen from
> > Intel. The value in /proc/cpuinfo comes from the APCI tables and is the value
> > reported by the BIOS. There are machines that report socket 1 and 2.
> > Haven't played with older kernels, but the topology information in sysfs
> > has existed since 2.6.16.
> Hi Stephen,
> 
> You didn't really address my point, because the Linux documentation says
> something doesn't mean it works in the real world.
> 
> If you take the example of the hugepages, they have been in the Linux
> kernel since 2.6.28, but the last fix was pushed in 2.6.37 and before
> 2.6.33 it is just impossible to do something with hugepages.
> If you take a look at section 2.3.2 of the DPDK getting started:
> http://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-dpdk-getting-started-guide.pdf,
> you will find that this issue is at least partially addressed.
> 
> So the point is that the topology information being theoretically
> correct on all kernel Linux supported by the DPDK, but the information
> being practically wrong on versions prior to kernel 3.3 means that the
> "fix" is breaking the support for a quite huge range of kernel versions...
> So maybe there might be a better way to get this node information on all
> kernel still currently supported by the DPDK.
> 
> Cheers,

I will just state what we saw. /proc/cpuinfo was wrong on several Dell machines
but sysfs is correct. And since are working on a product, not generic
DPDK, there is no motivation to work on older kernels.

I can dig out the email back and forth with the kernel developers to show that
/proc/cpuinfo is not intended to report anything useful for NUMA related API
if you want.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 4/7] eal: support different modules Stephen Hemminger
@ 2013-06-03  8:58   ` Damien Millescamps
  2013-06-03 15:41     ` Stephen Hemminger
  2013-06-03 16:08     ` Antti Kantee
  0 siblings, 2 replies; 39+ messages in thread
From: Damien Millescamps @ 2013-06-03  8:58 UTC (permalink / raw)
  To: dev

[-- Attachment #1: Type: text/plain, Size: 5791 bytes --]

comments inlined.

On 05/30/2013 07:12 PM, Stephen Hemminger wrote:
> An additional change is that failure to detect module should not be fatal,
> but an error like other EAL setup problems, allowing the application
> to recover.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>
> ---
>  app/chkincs/test_pci.c                  |    3 ++-
>  app/test/test_pci.c                     |    4 ++--
>  lib/librte_eal/common/include/rte_pci.h |    5 ++---
>  lib/librte_eal/linuxapp/eal/eal_pci.c   |    9 ++-------
>  lib/librte_pmd_igb/e1000_ethdev.c       |    2 +-
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c     |    4 ++--
>  6 files changed, 11 insertions(+), 16 deletions(-)
>
> --- a/lib/librte_eal/common/include/rte_pci.h	2013-05-29 08:45:38.000000000 -0700
> +++ b/lib/librte_eal/common/include/rte_pci.h	2013-05-29 09:02:50.000000000 -0700
> @@ -151,12 +151,11 @@ struct rte_pci_driver {
>  	pci_devinit_t *devinit;                 /**< Device init. function. */
>  	struct rte_pci_id *id_table;            /**< ID table, NULL terminated. */
>  	uint32_t drv_flags;                     /**< Flags contolling handling of device. */
> +	const char *module_name;		/**< Associated kernel module */

You solution only permits for one module to be checked during
initialization, while the former solution using flags could be easily
extended to check for more than one module.
However it is true that there is a problem with this module check since
it is historically hard-coded for "igb_uio".

Some better solution could be to have a new flag that will check for a
NULL terminated array of module_name.

>  };
>  
> -/** Device needs igb_uio kernel module */
> -#define RTE_PCI_DRV_NEED_IGB_UIO 0x0001
>  /** Device driver must be registered several times until failure */
> -#define RTE_PCI_DRV_MULTIPLE 0x0002
> +#define RTE_PCI_DRV_MULTIPLE 0x0001
You are breaking a public API here, and I don't see any technical reason
to do so. The RTE_PCI_DRV_NEED_IGB_UIO flag could be deprecated, but
there is no way its value could be recycled into an already existing flag.

>  
>  /**
>   * Probe the PCI bus for registered drivers.
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-03-28 08:50:50.000000000 -0700
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-05-29 09:04:00.000000000 -0700
> @@ -76,7 +76,7 @@
>   * This code is used to simulate a PCI probe by parsing information in
>   * sysfs. Moreover, when a registered driver matches a device, the
>   * kernel driver currently using it is unloaded and replaced by
> - * igb_uio module, which is a very minimal userland driver for Intel
> + * a module, which is a very minimal userland driver for Intel
>   * network card, only providing access to PCI BAR to applications, and
>   * enabling bus master.
>   */
> @@ -84,8 +84,6 @@
>  
>  #define PROC_MODULES "/proc/modules"
>  
> -#define IGB_UIO_NAME "igb_uio"
> -
>  #define UIO_NEWID "/sys/bus/pci/drivers/%s/new_id"
>  #define UIO_BIND  "/sys/bus/pci/drivers/%s/bind"
>  
> @@ -700,12 +698,9 @@ int
>  rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *dev)
>  {
>  	struct rte_pci_id *id_table;
> -	const char *module_name = NULL;
> +	const char *module_name = dr->module_name;
>  	int uio_status = -1;
>  
> -	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO)
> -		module_name = IGB_UIO_NAME;
> -
>  	uio_status = pci_uio_check_module(module_name);
>  
>  	for (id_table = dr->id_table ; id_table->vendor_id != 0; id_table++) {
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 08:45:38.000000000 -0700
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 16:08:07.544000027 -0700
> @@ -547,7 +547,7 @@ static struct eth_driver rte_ixgbe_pmd =
>  	{
>  		.name = "rte_ixgbe_pmd",
>  		.id_table = pci_id_ixgbe_map,
> -		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> +		.module_name = "igb_uio",
>  	},
>  	.eth_dev_init = eth_ixgbe_dev_init,
>  	.dev_private_size = sizeof(struct ixgbe_adapter),
> @@ -560,7 +560,7 @@ static struct eth_driver rte_ixgbevf_pmd
>  	{
>  		.name = "rte_ixgbevf_pmd",
>  		.id_table = pci_id_ixgbevf_map,
> -		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> +		.module_name = "igb_uio",
>  	},
>  	.eth_dev_init = eth_ixgbevf_dev_init,
>  	.dev_private_size = sizeof(struct ixgbe_adapter),
> --- a/app/test/test_pci.c	2013-03-28 08:50:50.000000000 -0700
> +++ b/app/test/test_pci.c	2013-05-29 16:05:22.881670915 -0700
> @@ -69,7 +69,7 @@ static int my_driver_init(struct rte_pci
>  			  struct rte_pci_device *dev);
>  
>  /*
> - * To test cases where RTE_PCI_DRV_NEED_IGB_UIO is set, and isn't set, two
> + * To test cases where module_name is set, and isn't set, two
>   * drivers are created (one with IGB devices, the other with IXGBE devices).
>   */
>  
> @@ -101,7 +101,7 @@ struct rte_pci_driver my_driver = {
>  	.name = "test_driver",
>  	.devinit = my_driver_init,
>  	.id_table = my_driver_id,
> -	.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> +	.module_name = "igb_uio",
>  };
>  
>  struct rte_pci_driver my_driver2 = {
> --- a/app/chkincs/test_pci.c	2013-03-28 08:50:50.210414181 -0700
> +++ b/app/chkincs/test_pci.c	2013-05-29 16:07:43.880238599 -0700
> @@ -64,7 +64,8 @@ struct rte_pci_driver my_driver = {
>  	"test_driver",
>  	my_driver_init,
>  	my_driver_id,
> -	RTE_PCI_DRV_NEED_IGB_UIO,
> +	0,
> +	"igb_uio",
>  };
>  
>  static int
> --- a/lib/librte_pmd_igb/e1000_ethdev.c	2013-05-29 08:45:38.888471864 -0700
> +++ b/lib/librte_pmd_igb/e1000_ethdev.c	2013-05-29 16:04:50.410003570 -0700
> @@ -336,7 +336,7 @@ static struct eth_driver rte_igb_pmd = {
>  	{
>  		.name = "rte_igb_pmd",
>  		.id_table = pci_id_igb_map,
> -		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> +		.module_name = "igb_uio",
>  	},
>  	.eth_dev_init = eth_igb_dev_init,
>  	.dev_private_size = sizeof(struct e1000_adapter),
>
>


[-- Attachment #2: Type: text/html, Size: 7227 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 0/7] Vyatta patches
  2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
                   ` (8 preceding siblings ...)
  2013-05-31  9:29 ` Damien Millescamps
@ 2013-06-03 15:22 ` Thomas Monjalon
  9 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-03 15:22 UTC (permalink / raw)
  To: dev

30/05/2013 19:12, Stephen Hemminger :
> This is a subset of the Vyatta patches we use. They include some bug
> fixes and simple changes to make life easier.

It would be nice to have these 7 patches in the next release (dpdk-1.2.3r3).
Reviewers are welcomed !

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03  8:58   ` Damien Millescamps
@ 2013-06-03 15:41     ` Stephen Hemminger
  2013-06-03 16:36       ` Thomas Monjalon
  2013-06-03 16:08     ` Antti Kantee
  1 sibling, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-06-03 15:41 UTC (permalink / raw)
  To: Damien Millescamps; +Cc: dev

On Mon, 03 Jun 2013 10:58:01 +0200
Damien Millescamps <damien.millescamps@6wind.com> wrote:

> comments inlined.
> 
> On 05/30/2013 07:12 PM, Stephen Hemminger wrote:
> > An additional change is that failure to detect module should not be fatal,
> > but an error like other EAL setup problems, allowing the application
> > to recover.
> >
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> >
> > ---
> >  app/chkincs/test_pci.c                  |    3 ++-
> >  app/test/test_pci.c                     |    4 ++--
> >  lib/librte_eal/common/include/rte_pci.h |    5 ++---
> >  lib/librte_eal/linuxapp/eal/eal_pci.c   |    9 ++-------
> >  lib/librte_pmd_igb/e1000_ethdev.c       |    2 +-
> >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c     |    4 ++--
> >  6 files changed, 11 insertions(+), 16 deletions(-)
> >
> > --- a/lib/librte_eal/common/include/rte_pci.h	2013-05-29 08:45:38.000000000 -0700
> > +++ b/lib/librte_eal/common/include/rte_pci.h	2013-05-29 09:02:50.000000000 -0700
> > @@ -151,12 +151,11 @@ struct rte_pci_driver {
> >  	pci_devinit_t *devinit;                 /**< Device init. function. */
> >  	struct rte_pci_id *id_table;            /**< ID table, NULL terminated. */
> >  	uint32_t drv_flags;                     /**< Flags contolling handling of device. */
> > +	const char *module_name;		/**< Associated kernel module */
> 
> You solution only permits for one module to be checked during
> initialization, while the former solution using flags could be easily
> extended to check for more than one module.
> However it is true that there is a problem with this module check since
> it is historically hard-coded for "igb_uio".
> 
> Some better solution could be to have a new flag that will check for a
> NULL terminated array of module_name.
> 
> >  };
> >  
> > -/** Device needs igb_uio kernel module */
> > -#define RTE_PCI_DRV_NEED_IGB_UIO 0x0001
> >  /** Device driver must be registered several times until failure */
> > -#define RTE_PCI_DRV_MULTIPLE 0x0002
> > +#define RTE_PCI_DRV_MULTIPLE 0x0001
> You are breaking a public API here, and I don't see any technical reason
> to do so. The RTE_PCI_DRV_NEED_IGB_UIO flag could be deprecated, but
> there is no way its value could be recycled into an already existing flag.
> 
> >  
> >  /**
> >   * Probe the PCI bus for registered drivers.
> > --- a/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-03-28 08:50:50.000000000 -0700
> > +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c	2013-05-29 09:04:00.000000000 -0700
> > @@ -76,7 +76,7 @@
> >   * This code is used to simulate a PCI probe by parsing information in
> >   * sysfs. Moreover, when a registered driver matches a device, the
> >   * kernel driver currently using it is unloaded and replaced by
> > - * igb_uio module, which is a very minimal userland driver for Intel
> > + * a module, which is a very minimal userland driver for Intel
> >   * network card, only providing access to PCI BAR to applications, and
> >   * enabling bus master.
> >   */
> > @@ -84,8 +84,6 @@
> >  
> >  #define PROC_MODULES "/proc/modules"
> >  
> > -#define IGB_UIO_NAME "igb_uio"
> > -
> >  #define UIO_NEWID "/sys/bus/pci/drivers/%s/new_id"
> >  #define UIO_BIND  "/sys/bus/pci/drivers/%s/bind"
> >  
> > @@ -700,12 +698,9 @@ int
> >  rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *dev)
> >  {
> >  	struct rte_pci_id *id_table;
> > -	const char *module_name = NULL;
> > +	const char *module_name = dr->module_name;
> >  	int uio_status = -1;
> >  
> > -	if (dr->drv_flags & RTE_PCI_DRV_NEED_IGB_UIO)
> > -		module_name = IGB_UIO_NAME;
> > -
> >  	uio_status = pci_uio_check_module(module_name);
> >  
> >  	for (id_table = dr->id_table ; id_table->vendor_id != 0; id_table++) {
> > --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 08:45:38.000000000 -0700
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c	2013-05-29 16:08:07.544000027 -0700
> > @@ -547,7 +547,7 @@ static struct eth_driver rte_ixgbe_pmd =
> >  	{
> >  		.name = "rte_ixgbe_pmd",
> >  		.id_table = pci_id_ixgbe_map,
> > -		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> > +		.module_name = "igb_uio",
> >  	},
> >  	.eth_dev_init = eth_ixgbe_dev_init,
> >  	.dev_private_size = sizeof(struct ixgbe_adapter),
> > @@ -560,7 +560,7 @@ static struct eth_driver rte_ixgbevf_pmd
> >  	{
> >  		.name = "rte_ixgbevf_pmd",
> >  		.id_table = pci_id_ixgbevf_map,
> > -		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> > +		.module_name = "igb_uio",
> >  	},
> >  	.eth_dev_init = eth_ixgbevf_dev_init,
> >  	.dev_private_size = sizeof(struct ixgbe_adapter),
> > --- a/app/test/test_pci.c	2013-03-28 08:50:50.000000000 -0700
> > +++ b/app/test/test_pci.c	2013-05-29 16:05:22.881670915 -0700
> > @@ -69,7 +69,7 @@ static int my_driver_init(struct rte_pci
> >  			  struct rte_pci_device *dev);
> >  
> >  /*
> > - * To test cases where RTE_PCI_DRV_NEED_IGB_UIO is set, and isn't set, two
> > + * To test cases where module_name is set, and isn't set, two
> >   * drivers are created (one with IGB devices, the other with IXGBE devices).
> >   */
> >  
> > @@ -101,7 +101,7 @@ struct rte_pci_driver my_driver = {
> >  	.name = "test_driver",
> >  	.devinit = my_driver_init,
> >  	.id_table = my_driver_id,
> > -	.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> > +	.module_name = "igb_uio",
> >  };
> >  
> >  struct rte_pci_driver my_driver2 = {
> > --- a/app/chkincs/test_pci.c	2013-03-28 08:50:50.210414181 -0700
> > +++ b/app/chkincs/test_pci.c	2013-05-29 16:07:43.880238599 -0700
> > @@ -64,7 +64,8 @@ struct rte_pci_driver my_driver = {
> >  	"test_driver",
> >  	my_driver_init,
> >  	my_driver_id,
> > -	RTE_PCI_DRV_NEED_IGB_UIO,
> > +	0,
> > +	"igb_uio",
> >  };
> >  
> >  static int
> > --- a/lib/librte_pmd_igb/e1000_ethdev.c	2013-05-29 08:45:38.888471864 -0700
> > +++ b/lib/librte_pmd_igb/e1000_ethdev.c	2013-05-29 16:04:50.410003570 -0700
> > @@ -336,7 +336,7 @@ static struct eth_driver rte_igb_pmd = {
> >  	{
> >  		.name = "rte_igb_pmd",
> >  		.id_table = pci_id_igb_map,
> > -		.drv_flags = RTE_PCI_DRV_NEED_IGB_UIO,
> > +		.module_name = "igb_uio",
> >  	},
> >  	.eth_dev_init = eth_igb_dev_init,
> >  	.dev_private_size = sizeof(struct e1000_adapter),
> >
> >
> 

It is one module per driver which works quite well.
The code identifies the module to load based on the PCI id.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03  8:58   ` Damien Millescamps
  2013-06-03 15:41     ` Stephen Hemminger
@ 2013-06-03 16:08     ` Antti Kantee
  2013-06-03 16:29       ` Thomas Monjalon
  1 sibling, 1 reply; 39+ messages in thread
From: Antti Kantee @ 2013-06-03 16:08 UTC (permalink / raw)
  To: dev

On 03.06.2013 10:58, Damien Millescamps wrote:
>>   };
>>
>> -/** Device needs igb_uio kernel module */
>> -#define RTE_PCI_DRV_NEED_IGB_UIO 0x0001
>>   /** Device driver must be registered several times until failure */
>> -#define RTE_PCI_DRV_MULTIPLE 0x0002
>> +#define RTE_PCI_DRV_MULTIPLE 0x0001
> You are breaking a public API here, and I don't see any technical reason
> to do so. The RTE_PCI_DRV_NEED_IGB_UIO flag could be deprecated, but
> there is no way its value could be recycled into an already existing flag.

Is breaking the API a bad thing in this context?  IMHO the 
initialization APIs need work before they're general enough and 
perpetually supporting the current ones seems like an unnecessary 
burden.  I'm trying to understand the general guidelines of the project.

(and nittily, recycling flag values is fine for static-only libs as long 
as you remove the old macro, but of course removal is the API breakage 
you mentioned)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03 16:08     ` Antti Kantee
@ 2013-06-03 16:29       ` Thomas Monjalon
  2013-06-03 17:25         ` Stephen Hemminger
  2013-06-03 18:40         ` Antti Kantee
  0 siblings, 2 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-03 16:29 UTC (permalink / raw)
  To: Antti Kantee; +Cc: dev

03/06/2013 18:08, Antti Kantee :
> On 03.06.2013 10:58, Damien Millescamps wrote:
> >> -/** Device needs igb_uio kernel module */
> >> -#define RTE_PCI_DRV_NEED_IGB_UIO 0x0001
> >> 
> >>   /** Device driver must be registered several times until failure */
> >> 
> >> -#define RTE_PCI_DRV_MULTIPLE 0x0002
> >> +#define RTE_PCI_DRV_MULTIPLE 0x0001
> > 
> > You are breaking a public API here, and I don't see any technical reason
> > to do so. The RTE_PCI_DRV_NEED_IGB_UIO flag could be deprecated, but
> > there is no way its value could be recycled into an already existing
> > flag.
> 
> Is breaking the API a bad thing in this context?  IMHO the
> initialization APIs need work before they're general enough and
> perpetually supporting the current ones seems like an unnecessary
> burden.  I'm trying to understand the general guidelines of the project.
> 
> (and nittily, recycling flag values is fine for static-only libs as long
> as you remove the old macro, but of course removal is the API breakage
> you mentioned)

Yes, DPDK is a young project but breaking API should be always justified.
In this case it is not mandatory to change it.

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03 15:41     ` Stephen Hemminger
@ 2013-06-03 16:36       ` Thomas Monjalon
  2013-06-03 17:26         ` Stephen Hemminger
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-03 16:36 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

03/06/2013 17:41, Stephen Hemminger :
> On Mon, 03 Jun 2013 10:58:01 +0200
> Damien Millescamps <damien.millescamps@6wind.com> wrote:
> > > 
> > > --- a/lib/librte_eal/common/include/rte_pci.h	2013-05-29
> > > 08:45:38.000000000 -0700 +++
> > > b/lib/librte_eal/common/include/rte_pci.h	2013-05-29
> > > 09:02:50.000000000 -0700 @@ -151,12 +151,11 @@ struct rte_pci_driver {
> > > 
> > >  	pci_devinit_t *devinit;                 /**< Device init. function.
> > >  	*/ struct rte_pci_id *id_table;            /**< ID table, NULL
> > >  	terminated. */ uint32_t drv_flags;                     /**< Flags
> > >  	contolling handling of device. */
> > > 
> > > +	const char *module_name;		/**< Associated kernel module */
> > 
> > You solution only permits for one module to be checked during
> > initialization, while the former solution using flags could be easily
> > extended to check for more than one module.
> > However it is true that there is a problem with this module check since
> > it is historically hard-coded for "igb_uio".
> 
> It is one module per driver which works quite well.
> The code identifies the module to load based on the PCI id.

There are cases where we need more than one kernel module for 2 reasons:
	- in this case, igb_uio is built and inserted without dependency on uio
	- some other PMDs/NICs, several modules without dependency links could be 
needed

Please could you refactor your patch accordingly ?
Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device Stephen Hemminger
@ 2013-06-03 16:41   ` Thomas Monjalon
  2013-06-05 14:50   ` Damien Millescamps
  1 sibling, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-03 16:41 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Please, could you provide a commit log to better explain your patch ?

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03 16:29       ` Thomas Monjalon
@ 2013-06-03 17:25         ` Stephen Hemminger
  2013-06-03 18:40         ` Antti Kantee
  1 sibling, 0 replies; 39+ messages in thread
From: Stephen Hemminger @ 2013-06-03 17:25 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, 3 Jun 2013 18:29:02 +0200
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:

> 03/06/2013 18:08, Antti Kantee :
> > On 03.06.2013 10:58, Damien Millescamps wrote:
> > >> -/** Device needs igb_uio kernel module */
> > >> -#define RTE_PCI_DRV_NEED_IGB_UIO 0x0001
> > >> 
> > >>   /** Device driver must be registered several times until failure */
> > >> 
> > >> -#define RTE_PCI_DRV_MULTIPLE 0x0002
> > >> +#define RTE_PCI_DRV_MULTIPLE 0x0001
> > > 
> > > You are breaking a public API here, and I don't see any technical reason
> > > to do so. The RTE_PCI_DRV_NEED_IGB_UIO flag could be deprecated, but
> > > there is no way its value could be recycled into an already existing
> > > flag.
> > 
> > Is breaking the API a bad thing in this context?  IMHO the
> > initialization APIs need work before they're general enough and
> > perpetually supporting the current ones seems like an unnecessary
> > burden.  I'm trying to understand the general guidelines of the project.
> > 
> > (and nittily, recycling flag values is fine for static-only libs as long
> > as you remove the old macro, but of course removal is the API breakage
> > you mentioned)
> 
> Yes, DPDK is a young project but breaking API should be always justified.
> In this case it is not mandatory to change it.
> 

This is a source project, there is no fixed ABI.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03 16:36       ` Thomas Monjalon
@ 2013-06-03 17:26         ` Stephen Hemminger
  2013-06-04  9:17           ` Damien Millescamps
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-06-03 17:26 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, 3 Jun 2013 18:36:52 +0200
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:

> 03/06/2013 17:41, Stephen Hemminger :
> > On Mon, 03 Jun 2013 10:58:01 +0200
> > Damien Millescamps <damien.millescamps@6wind.com> wrote:
> > > > 
> > > > --- a/lib/librte_eal/common/include/rte_pci.h	2013-05-29
> > > > 08:45:38.000000000 -0700 +++
> > > > b/lib/librte_eal/common/include/rte_pci.h	2013-05-29
> > > > 09:02:50.000000000 -0700 @@ -151,12 +151,11 @@ struct rte_pci_driver {
> > > > 
> > > >  	pci_devinit_t *devinit;                 /**< Device init. function.
> > > >  	*/ struct rte_pci_id *id_table;            /**< ID table, NULL
> > > >  	terminated. */ uint32_t drv_flags;                     /**< Flags
> > > >  	contolling handling of device. */
> > > > 
> > > > +	const char *module_name;		/**< Associated kernel module */
> > > 
> > > You solution only permits for one module to be checked during
> > > initialization, while the former solution using flags could be easily
> > > extended to check for more than one module.
> > > However it is true that there is a problem with this module check since
> > > it is historically hard-coded for "igb_uio".
> > 
> > It is one module per driver which works quite well.
> > The code identifies the module to load based on the PCI id.
> 
> There are cases where we need more than one kernel module for 2 reasons:
> 	- in this case, igb_uio is built and inserted without dependency on uio
> 	- some other PMDs/NICs, several modules without dependency links could be 
> needed
> 
> Please could you refactor your patch accordingly ?
> Thanks

There is nothing that says a PMD could load a second module, but why not put
that special case code in the the driver. Or make it a list of strings.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03 16:29       ` Thomas Monjalon
  2013-06-03 17:25         ` Stephen Hemminger
@ 2013-06-03 18:40         ` Antti Kantee
  1 sibling, 0 replies; 39+ messages in thread
From: Antti Kantee @ 2013-06-03 18:40 UTC (permalink / raw)
  To: dev

On 03.06.2013 18:29, Thomas Monjalon wrote:
>> Is breaking the API a bad thing in this context?  IMHO the
>> initialization APIs need work before they're general enough and
>> perpetually supporting the current ones seems like an unnecessary
>> burden.  I'm trying to understand the general guidelines of the project.
>>
>> (and nittily, recycling flag values is fine for static-only libs as long
>> as you remove the old macro, but of course removal is the API breakage
>> you mentioned)
>
> Yes, DPDK is a young project but breaking API should be always justified.
> In this case it is not mandatory to change it.

Ok, I was writing with the premise that Stephen's patch would be accepted.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 4/7] eal: support different modules
  2013-06-03 17:26         ` Stephen Hemminger
@ 2013-06-04  9:17           ` Damien Millescamps
  0 siblings, 0 replies; 39+ messages in thread
From: Damien Millescamps @ 2013-06-04  9:17 UTC (permalink / raw)
  To: dev

[-- Attachment #1: Type: text/plain, Size: 703 bytes --]

On 06/03/2013 07:26 PM, Stephen Hemminger wrote:
> There is nothing that says a PMD could load a second module, but why not put
> that special case code in the the driver. Or make it a list of strings.
There is nothing that says anything about PMD and loading a single
module. However, the Getting Started Guide refers to "dymanic kernel
tweaks" citing "modules", all in a plural form:
/"Running an Intel®DPDK application requires some kernel configuration
customization (done at build time) and some dynamic kernel tweaks
(modules, procfs)./"

A list of strings can be implemented using a "null terminated array", as
suggested in my review. So yes, it could be a good solution.

-- 
Damien Millescamps

[-- Attachment #2: Type: text/html, Size: 1272 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576 Stephen Hemminger
@ 2013-06-05 14:22   ` Vincent JARDIN
  2013-06-12 10:06     ` [dpdk-dev] [PATCH 1/7] " Thomas Monjalon
  0 siblings, 1 reply; 39+ messages in thread
From: Vincent JARDIN @ 2013-06-05 14:22 UTC (permalink / raw)
  To: dev

On 30/05/2013 19:12, Stephen Hemminger wrote:
> The 82576 has known issues which require the write threshold to be set to 1.
> See:
> 	http://download.intel.com/design/network/specupdt/82576_SPECUPDATE.pdf
>
> If not then single packets will hang in transmit ring until more arrive.
> Simple tests like ping will fail.
>
> Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>

It looks good. Thank you.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case Stephen Hemminger
@ 2013-06-05 14:25   ` Vincent JARDIN
  2013-06-12 10:07     ` Thomas Monjalon
  0 siblings, 1 reply; 39+ messages in thread
From: Vincent JARDIN @ 2013-06-05 14:25 UTC (permalink / raw)
  To: dev

On 30/05/2013 19:12, Stephen Hemminger wrote:
> In many application there are no timers queued, and the call to
> rte_timer_managecan be optimized in that case avoid reading HPET and
> lock overhead.
>
> Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>

It is a nice to have for performance since HPET is quite slow.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 3/7] optimize log/panic
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 3/7] optimize log/panic Stephen Hemminger
@ 2013-06-05 14:34   ` Vincent JARDIN
  2013-06-12 10:09     ` Thomas Monjalon
  0 siblings, 1 reply; 39+ messages in thread
From: Vincent JARDIN @ 2013-06-05 14:34 UTC (permalink / raw)
  To: dev

On 30/05/2013 19:12, Stephen Hemminger wrote:
> Both logging and calls to panic are never in the critical path.
> Use the GCC attribute cold to mark these functions as cold,
> which generates more optimised code.
>
> Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>

It does not hurt to move both to a cold section. TODO: some other init 
and setup functions should be cold too.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters Stephen Hemminger
@ 2013-06-05 14:36   ` Vincent JARDIN
  2013-06-12 10:18     ` Thomas Monjalon
  0 siblings, 1 reply; 39+ messages in thread
From: Vincent JARDIN @ 2013-06-05 14:36 UTC (permalink / raw)
  To: dev

On 30/05/2013 19:12, Stephen Hemminger wrote:
> By default, DPDK based applications would only allow logging
> to syslog as "rte", DAEMON; but for any production application more
> control is desired to allow using actual application name and
> overriding the facility.
>
> Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-05-30 17:12 ` [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device Stephen Hemminger
  2013-06-03 16:41   ` Thomas Monjalon
@ 2013-06-05 14:50   ` Damien Millescamps
  2013-06-05 15:49     ` Stephen Hemminger
  1 sibling, 1 reply; 39+ messages in thread
From: Damien Millescamps @ 2013-06-05 14:50 UTC (permalink / raw)
  To: dev

Hi Stephen,

Overall this patch is very nice. My only comment on this one is why do
you limit the max number of memory resources to 5 ?
The PCI configuration space permits to store up to 6 base addresses.

> +#define PCI_MEM_RESOURCE 5

Please, can you add a log/comment with your patch, too ?


Cheers,
-- 
Damien Millescamps

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-06-05 14:50   ` Damien Millescamps
@ 2013-06-05 15:49     ` Stephen Hemminger
  2013-06-05 18:05       ` Damien Millescamps
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-06-05 15:49 UTC (permalink / raw)
  To: Damien Millescamps; +Cc: dev

On Wed, 05 Jun 2013 16:50:03 +0200
Damien Millescamps <damien.millescamps@6wind.com> wrote:

> Hi Stephen,
> 
> Overall this patch is very nice. My only comment on this one is why do
> you limit the max number of memory resources to 5 ?
> The PCI configuration space permits to store up to 6 base addresses.
> 
> > +#define PCI_MEM_RESOURCE 5
> 
> Please, can you add a log/comment with your patch, too ?
> 
> 
> Cheers,

Only because I was trying to save some space, and I didn't see any hardware
with that many useful regions. Also the kernel UIO driver has some control
over which regions get exposed.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-06-05 15:49     ` Stephen Hemminger
@ 2013-06-05 18:05       ` Damien Millescamps
  2013-06-05 21:33         ` Stephen Hemminger
  0 siblings, 1 reply; 39+ messages in thread
From: Damien Millescamps @ 2013-06-05 18:05 UTC (permalink / raw)
  To: dev

On 06/05/2013 05:49 PM, Stephen Hemminger wrote:
> On Wed, 05 Jun 2013 16:50:03 +0200
> Damien Millescamps <damien.millescamps@6wind.com> wrote:
>
>> Hi Stephen,
>>
>> Overall this patch is very nice. My only comment on this one is why do
>> you limit the max number of memory resources to 5 ?
>> The PCI configuration space permits to store up to 6 base addresses.
>>
>>> +#define PCI_MEM_RESOURCE 5
>> Please, can you add a log/comment with your patch, too ?
>>
>>
>> Cheers,
> Only because I was trying to save some space, and I didn't see any hardware
> with that many useful regions. Also the kernel UIO driver has some control
> over which regions get exposed.

I agree that hardware generally don't use that much BAR for the PCIe.
However, this is only a matter of 20 to 24 Bytes, so I don't see any
reason not defining this macro as per the PCI standard value.

Could you add a commit log and change that so it can be ack'd and pushed
in the DPDK repository ?

Thanks,
-- 
Damien Millescamps

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-06-05 18:05       ` Damien Millescamps
@ 2013-06-05 21:33         ` Stephen Hemminger
  2013-06-18  1:28           ` somnath kotur
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Hemminger @ 2013-06-05 21:33 UTC (permalink / raw)
  To: Damien Millescamps; +Cc: dev

On Wed, 05 Jun 2013 20:05:15 +0200
Damien Millescamps <damien.millescamps@6wind.com> wrote:

> On 06/05/2013 05:49 PM, Stephen Hemminger wrote:
> > On Wed, 05 Jun 2013 16:50:03 +0200
> > Damien Millescamps <damien.millescamps@6wind.com> wrote:
> >
> >> Hi Stephen,
> >>
> >> Overall this patch is very nice. My only comment on this one is why do
> >> you limit the max number of memory resources to 5 ?
> >> The PCI configuration space permits to store up to 6 base addresses.
> >>
> >>> +#define PCI_MEM_RESOURCE 5
> >> Please, can you add a log/comment with your patch, too ?
> >>
> >>
> >> Cheers,
> > Only because I was trying to save some space, and I didn't see any hardware
> > with that many useful regions. Also the kernel UIO driver has some control
> > over which regions get exposed.
> 
> I agree that hardware generally don't use that much BAR for the PCIe.
> However, this is only a matter of 20 to 24 Bytes, so I don't see any
> reason not defining this macro as per the PCI standard value.
> 
> Could you add a commit log and change that so it can be ack'd and pushed
> in the DPDK repository ?
> 
> Thanks,

Go ahead and change the 5 to a 7.
No point in another resend of whole pile.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 1/7] igb: workaround errata with wthresh on 82576
  2013-06-05 14:22   ` Vincent JARDIN
@ 2013-06-12 10:06     ` Thomas Monjalon
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-12 10:06 UTC (permalink / raw)
  To: dev

05/06/2013 16:22, Vincent JARDIN :
> Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>
> 
> It looks good. Thank you.

applied

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case
  2013-06-05 14:25   ` Vincent JARDIN
@ 2013-06-12 10:07     ` Thomas Monjalon
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-12 10:07 UTC (permalink / raw)
  To: dev

05/06/2013 16:25, Vincent JARDIN :
> On 30/05/2013 19:12, Stephen Hemminger wrote:
> > Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
> 
> Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>
> 
> It is a nice to have for performance since HPET is quite slow.

applied

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 3/7] optimize log/panic
  2013-06-05 14:34   ` Vincent JARDIN
@ 2013-06-12 10:09     ` Thomas Monjalon
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-12 10:09 UTC (permalink / raw)
  To: dev

05/06/2013 16:34, Vincent JARDIN :
> On 30/05/2013 19:12, Stephen Hemminger wrote:
> > Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
> 
> Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>
> 
> It does not hurt to move both to a cold section. TODO: some other init
> and setup functions should be cold too.

applied with modified title:
log: optimize log/panic with attribute cold

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters
  2013-06-05 14:36   ` Vincent JARDIN
@ 2013-06-12 10:18     ` Thomas Monjalon
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Monjalon @ 2013-06-12 10:18 UTC (permalink / raw)
  To: dev

05/06/2013 16:36, Vincent JARDIN :
> On 30/05/2013 19:12, Stephen Hemminger wrote:
> > Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
> 
> Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>

applied with modified title:
	log: add ability to override syslog parameters

-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-06-05 21:33         ` Stephen Hemminger
@ 2013-06-18  1:28           ` somnath kotur
  2013-07-16  8:53             ` Thomas Monjalon
  0 siblings, 1 reply; 39+ messages in thread
From: somnath kotur @ 2013-06-18  1:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

Hi
 Any idea on the ETA for this patch? I'm presuming it's all good to be
pulled into the mainline otherwise?

Thanks
Som


On Thu, Jun 6, 2013 at 3:03 AM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Wed, 05 Jun 2013 20:05:15 +0200
> Damien Millescamps <damien.millescamps@6wind.com> wrote:
>
> > On 06/05/2013 05:49 PM, Stephen Hemminger wrote:
> > > On Wed, 05 Jun 2013 16:50:03 +0200
> > > Damien Millescamps <damien.millescamps@6wind.com> wrote:
> > >
> > >> Hi Stephen,
> > >>
> > >> Overall this patch is very nice. My only comment on this one is why do
> > >> you limit the max number of memory resources to 5 ?
> > >> The PCI configuration space permits to store up to 6 base addresses.
> > >>
> > >>> +#define PCI_MEM_RESOURCE 5
> > >> Please, can you add a log/comment with your patch, too ?
> > >>
> > >>
> > >> Cheers,
> > > Only because I was trying to save some space, and I didn't see any
> hardware
> > > with that many useful regions. Also the kernel UIO driver has some
> control
> > > over which regions get exposed.
> >
> > I agree that hardware generally don't use that much BAR for the PCIe.
> > However, this is only a matter of 20 to 24 Bytes, so I don't see any
> > reason not defining this macro as per the PCI standard value.
> >
> > Could you add a commit log and change that so it can be ack'd and pushed
> > in the DPDK repository ?
> >
> > Thanks,
>
> Go ahead and change the 5 to a 7.
> No point in another resend of whole pile.
>

[-- Attachment #2: Type: text/html, Size: 2276 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-06-18  1:28           ` somnath kotur
@ 2013-07-16  8:53             ` Thomas Monjalon
  2013-07-19 16:44               ` Stephen Hemminger
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Monjalon @ 2013-07-16  8:53 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Hello,

18/06/2013 03:28, somnath kotur :
>  Any idea on the ETA for this patch? I'm presuming it's all good to be
> pulled into the mainline otherwise?

We are waiting for a commit log.
Please Stephen, could you fix the number of PCI_MEM_RESOURCE and provide a 
commit log ?

Thank you
-- 
Thomas

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device
  2013-07-16  8:53             ` Thomas Monjalon
@ 2013-07-19 16:44               ` Stephen Hemminger
  0 siblings, 0 replies; 39+ messages in thread
From: Stephen Hemminger @ 2013-07-19 16:44 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Intel incorporated this into later version of DPDK (not released yet),
let me respin the patch so that the API comes out the same. I don't want
to cause pain on the assumption that the open and Intel versions will want
to stay converged.

Also, want to support I/O regions which are necessary when doing virtio
devices natively.

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2013-07-19 16:44 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-30 17:12 [dpdk-dev] [PATCH 0/7] Vyatta patches Stephen Hemminger
2013-05-30 17:12 ` [dpdk-dev] [PATCH 1/7] [PATCH 4/8] igb: workaround errata with wthresh on 82576 Stephen Hemminger
2013-06-05 14:22   ` Vincent JARDIN
2013-06-12 10:06     ` [dpdk-dev] [PATCH 1/7] " Thomas Monjalon
2013-05-30 17:12 ` [dpdk-dev] [PATCH 2/7] rte_timer: optimize for empty case Stephen Hemminger
2013-06-05 14:25   ` Vincent JARDIN
2013-06-12 10:07     ` Thomas Monjalon
2013-05-30 17:12 ` [dpdk-dev] [PATCH 3/7] optimize log/panic Stephen Hemminger
2013-06-05 14:34   ` Vincent JARDIN
2013-06-12 10:09     ` Thomas Monjalon
2013-05-30 17:12 ` [dpdk-dev] [PATCH 4/7] eal: support different modules Stephen Hemminger
2013-06-03  8:58   ` Damien Millescamps
2013-06-03 15:41     ` Stephen Hemminger
2013-06-03 16:36       ` Thomas Monjalon
2013-06-03 17:26         ` Stephen Hemminger
2013-06-04  9:17           ` Damien Millescamps
2013-06-03 16:08     ` Antti Kantee
2013-06-03 16:29       ` Thomas Monjalon
2013-06-03 17:25         ` Stephen Hemminger
2013-06-03 18:40         ` Antti Kantee
2013-05-30 17:12 ` [dpdk-dev] [PATCH 5/7] pci: support multiple PCI regions per device Stephen Hemminger
2013-06-03 16:41   ` Thomas Monjalon
2013-06-05 14:50   ` Damien Millescamps
2013-06-05 15:49     ` Stephen Hemminger
2013-06-05 18:05       ` Damien Millescamps
2013-06-05 21:33         ` Stephen Hemminger
2013-06-18  1:28           ` somnath kotur
2013-07-16  8:53             ` Thomas Monjalon
2013-07-19 16:44               ` Stephen Hemminger
2013-05-30 17:12 ` [dpdk-dev] [PATCH 6/7] igb_uio: pci_block_user_cfg_access is unsafe, remove it Stephen Hemminger
2013-05-30 17:12 ` [dpdk-dev] [PATCH 7/7] eal: add ability to override DPDK syslog parameters Stephen Hemminger
2013-06-05 14:36   ` Vincent JARDIN
2013-06-12 10:18     ` Thomas Monjalon
2013-05-30 22:20 ` [dpdk-dev] [PATCH 0/7] Vyatta patches Thomas Monjalon
2013-05-31  9:29 ` Damien Millescamps
2013-05-31 15:45   ` Stephen Hemminger
2013-05-31 16:44     ` Damien Millescamps
2013-05-31 17:00       ` Stephen Hemminger
2013-06-03 15:22 ` Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).