DPDK patches and discussions
* [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core
       [not found] <1417589628-43666-1-git-send-email-cunming.liang@intel.com>
@ 2015-01-22  8:16 ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 01/15] eal: add cpuset into per EAL thread lcore_config Cunming Liang
                     ` (16 more replies)
  0 siblings, 17 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC
  To: dev

This patch series contains EAL enhancements and library fixes that allow
multiple pthreads (either EAL or non-EAL threads) to run per physical core.
The two major changes are listed below:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore now stands for an EAL thread rather than a logical core.
  The change adds a new EAL option that allows static lcore-to-cpuset assignment.
  An lcore (EAL thread) is then affinitized to a cpuset; the original 1:1 mapping is a special case.
- Fix the libraries so that they can run on any non-EAL thread.
  This closes the gaps when libraries run in a non-EAL thread (created dynamically by the user).
  Each fixed library handles the case of rte_lcore_id() >= RTE_MAX_LCORE.
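
To illustrate the second change: a library that keeps per-lcore state indexed
by rte_lcore_id() must not index that state with an out-of-range id. Below is a
minimal sketch of such a guard. MAX_LCORE, LCORE_ID_NONE, stats_add() and the
shared fallback counter are illustrative names chosen for this example, not
code from the series, and __atomic_fetch_add is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LCORE 128u                 /* stand-in for RTE_MAX_LCORE */
#define LCORE_ID_NONE ((unsigned)-1)   /* id seen by a non-EAL thread (assumption) */

static uint64_t per_lcore_count[MAX_LCORE]; /* fast path: one slot per EAL thread */
static uint64_t shared_count;               /* fallback for non-EAL threads */

/* Add n to the counter that belongs to lcore_id.
 * Returns 1 if the per-lcore slot was used, 0 if the shared fallback was. */
static int
stats_add(unsigned lcore_id, uint64_t n)
{
	if (lcore_id < MAX_LCORE) {
		per_lcore_count[lcore_id] += n;
		return 1;
	}
	/* non-EAL thread: never index the per-lcore table; the atomic add
	 * keeps concurrent callers of the shared counter safe */
	__atomic_fetch_add(&shared_count, n, __ATOMIC_RELAXED);
	return 0;
}
```

The trade-off in this sketch is that non-EAL threads pay for an atomic
operation, while EAL threads keep their uncontended per-lcore slot.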
  
Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in RFC review.

Cunming Liang (15):
  eal: add cpuset into per EAL thread lcore_config
  eal: new eal option '--lcores' for cpu assignment
  eal: add support parsing socket_id from cpuset
  eal: new TLS definition and API declaration
  eal: add eal_common_thread.c for common thread API
  eal: add rte_gettid() to acquire unique system tid
  eal: apply affinity of EAL thread by assigned cpuset
  enic: fix re-define freebsd compile complain
  malloc: fix the issue of SOCKET_ID_ANY
  log: fix the gap to support non-EAL thread
  eal: set _lcore_id and _socket_id to (-1) by default
  eal: fix recursive spinlock in non-EAL thread
  mempool: add support to non-EAL thread
  ring: add support to non-EAL thread
  timer: add support to non-EAL thread

 lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
 lib/librte_eal/bsdapp/eal/eal.c                    |  13 +-
 lib/librte_eal/bsdapp/eal/eal_lcore.c              |  14 ++
 lib/librte_eal/bsdapp/eal/eal_memory.c             |   2 +
 lib/librte_eal/bsdapp/eal/eal_thread.c             |  76 +++---
 lib/librte_eal/common/eal_common_launch.c          |   1 -
 lib/librte_eal/common/eal_common_log.c             |  17 +-
 lib/librte_eal/common/eal_common_options.c         | 262 ++++++++++++++++++++-
 lib/librte_eal/common/eal_common_thread.c          | 142 +++++++++++
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/eal_thread.h                 |  66 ++++++
 .../common/include/generic/rte_spinlock.h          |   4 +-
 lib/librte_eal/common/include/rte_eal.h            |  27 +++
 lib/librte_eal/common/include/rte_lcore.h          |  37 ++-
 lib/librte_eal/common/include/rte_log.h            |   5 +
 lib/librte_eal/linuxapp/eal/Makefile               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   7 +-
 lib/librte_eal/linuxapp/eal/eal_lcore.c            |  15 ++
 lib/librte_eal/linuxapp/eal/eal_thread.c           |  78 +++---
 lib/librte_malloc/malloc_heap.h                    |   7 +-
 lib/librte_mempool/rte_mempool.h                   |  18 +-
 lib/librte_pmd_enic/enic.h                         |   1 +
 lib/librte_pmd_enic/enic_compat.h                  |   1 +
 lib/librte_ring/rte_ring.h                         |  10 +-
 lib/librte_timer/rte_timer.c                       |  40 +++-
 lib/librte_timer/rte_timer.h                       |   2 +-
 26 files changed, 721 insertions(+), 131 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

-- 
1.8.1.4


* [dpdk-dev] [PATCH v1 01/15] eal: add cpuset into per EAL thread lcore_config
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment Cunming Liang
                     ` (15 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC
  To: dev

This patch adds 'cpuset' to the per-lcore configuration 'lcore_config[]',
because an lcore is no longer always pinned 1:1 to a physical cpu.
An lcore now stands for an EAL thread rather than a logical cpu.

It doesn't change the default behavior of 1:1 mapping, but it allows
an EAL thread to be affinitized to multiple cpus.
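
The default mapping can be pictured as follows. This is a minimal sketch that
uses a 64-bit mask as a stand-in for rte_cpuset_t. NUM_LCORE, struct lcore_cfg
and lcore_cfg_init() are illustrative names rather than EAL code.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LCORE 8u   /* stand-in for RTE_MAX_LCORE, kept small for the sketch */

struct lcore_cfg {
	int detected;      /* non-zero if the cpu with the same id exists */
	uint64_t cpuset;   /* bit n set means the thread may run on cpu n */
};

/* Give every detected lcore a cpuset that holds only the cpu with the same id. */
static void
lcore_cfg_init(struct lcore_cfg cfg[NUM_LCORE], unsigned ncpus)
{
	unsigned id;

	assert(NUM_LCORE <= 64);   /* every lcore id must fit in the 64-bit mask */
	for (id = 0; id < NUM_LCORE; id++) {
		cfg[id].cpuset = 0;                  /* like CPU_ZERO() */
		cfg[id].detected = (id < ncpus);
		if (!cfg[id].detected)
			continue;
		cfg[id].cpuset |= UINT64_C(1) << id; /* like CPU_SET(id, ...) */
	}
}
```

An undetected lcore keeps an empty cpuset, which matches the order of
CPU_ZERO() and CPU_SET() in the hunks below.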

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c     | 7 +++++++
 lib/librte_eal/bsdapp/eal/eal_memory.c    | 2 ++
 lib/librte_eal/common/include/rte_lcore.h | 8 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile      | 1 +
 lib/librte_eal/linuxapp/eal/eal_lcore.c   | 8 ++++++++
 5 files changed, 26 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 662f024..72f8ac2 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -76,11 +76,18 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
 		lcore_config[lcore_id].detected = (lcore_id < ncpus);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, each lcore maps 1:1 to a cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 65ee87d..a34d500 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -45,6 +45,8 @@
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 
+/* avoid redefining the macro already defined by a FreeBSD header */
+#undef PAGE_SIZE
 #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
 
 /*
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 49b2c03..4c7d6bb 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -50,6 +50,13 @@ extern "C" {
 
 #define LCORE_ID_ANY -1    /**< Any lcore. */
 
+#if defined(__linux__)
+	typedef	cpu_set_t rte_cpuset_t;
+#elif defined(__FreeBSD__)
+#include <pthread_np.h>
+	typedef cpuset_t rte_cpuset_t;
+#endif
+
 /**
  * Structure storing internal configuration (per-lcore)
  */
@@ -65,6 +72,7 @@ struct lcore_config {
 	unsigned socket_id;        /**< physical socket id for this lcore */
 	unsigned core_id;          /**< core number on socket for this lcore */
 	int core_index;            /**< relative index, starting from 0 */
+	rte_cpuset_t cpuset;       /**< cpu set the lcore is affinitized to */
 };
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 72ecf3a..0e9c447 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -87,6 +87,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
+CFLAGS_eal_lcore.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index c67e0e6..29615f8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -158,11 +158,19 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
+		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, each lcore maps 1:1 to a cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
-- 
1.8.1.4


* [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 01/15] eal: add cpuset into per EAL thread lcore_config Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22 12:19     ` Bruce Richardson
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 03/15] eal: add support parsing socket_id from cpuset Cunming Liang
                     ` (14 subsequent siblings)
  16 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC
  To: dev

This patch adds a new EAL long option, '--lcores', for EAL thread cpuset assignment.

The format pattern:
	--lcores='lcores[@cpus]<,lcores[@cpus]>'
lcores and cpus can each be a single number or a group.
'(' and ')' are required if it's a group.
If '@cpus' is not supplied, cpus takes the same value as lcores.

e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' starts 7 EAL threads as follows:
  lcore 0 runs on cpuset 0x41 (cpu 0,6)
  lcore 1 runs on cpuset 0x2 (cpu 1)
  lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
  lcores 3,4,5 run on cpuset 0x5 (cpu 0,2)
  lcore 6 runs on cpuset 0x41 (cpu 0,6)
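
The cpuset values in the example are plain bitmasks with bit n set for cpu n.
The following sketch recomputes them. cpu_range_mask() is an illustrative
helper, not part of EAL, and a 64-bit integer stands in for the cpuset.

```c
#include <assert.h>
#include <stdint.h>

/* Return a mask with bits first..last (inclusive) set; one bit per cpu. */
static uint64_t
cpu_range_mask(unsigned first, unsigned last)
{
	uint64_t mask = 0;
	unsigned cpu;

	assert(first <= last && last < 64);
	for (cpu = first; cpu <= last; cpu++)
		mask |= UINT64_C(1) << cpu;
	return mask;
}
```

With this helper, '(5-7)' corresponds to cpu_range_mask(5, 7), which is 0xe0,
and '(0,6)' corresponds to cpu_range_mask(0, 0) | cpu_range_mask(6, 6), which
is 0x41.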

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_launch.c  |   1 -
 lib/librte_eal/common/eal_common_options.c | 262 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/Makefile       |   1 +
 4 files changed, 261 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
index 599f83b..2d732b1 100644
--- a/lib/librte_eal/common/eal_common_launch.c
+++ b/lib/librte_eal/common/eal_common_launch.c
@@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
 		rte_eal_wait_lcore(lcore_id);
 	}
 }
-
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index e2810ab..fc47588 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -45,6 +45,7 @@
 #include <rte_lcore.h>
 #include <rte_version.h>
 #include <rte_devargs.h>
+#include <rte_memcpy.h>
 
 #include "eal_internal_cfg.h"
 #include "eal_options.h"
@@ -85,6 +86,7 @@ eal_long_options[] = {
 	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
 	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
 	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
+	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
 	{0, 0, 0, 0}
 };
 
@@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
 			if (min == RTE_MAX_LCORE)
 				min = idx;
 			for (idx = min; idx <= max; idx++) {
-				cfg->lcore_role[idx] = ROLE_RTE;
-				lcore_config[idx].core_index = count;
-				count++;
+				if (cfg->lcore_role[idx] != ROLE_RTE) {
+					cfg->lcore_role[idx] = ROLE_RTE;
+					lcore_config[idx].core_index = count;
+					count++;
+				}
 			}
 			min = RTE_MAX_LCORE;
 		} else
@@ -289,6 +293,241 @@ eal_parse_master_lcore(const char *arg)
 	return 0;
 }
 
+/*
+ * Parse an element: either a single number or a group within '(' ')'.
+ * Within a group, '-' is used as a range separator;
+ *                 ',' is used as a single number separator.
+ */
+static int
+eal_parse_set(const char *input, uint16_t set[], unsigned num)
+{
+	unsigned idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit or a left bracket qualifies as the start point */
+	if ((!isdigit(*str) && *str != '(') || *str == '\0')
+		return -1;
+
+	/* process single number */
+	if (*str != '(') {
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+		else {
+			while (isblank(*end))
+				end++;
+
+			if (*end != ',' && *end != '\0' &&
+			    *end != '@')
+				return -1;
+
+			set[idx] = 1;
+			return end - input;
+		}
+	}
+
+	/* process set within bracket */
+	str++;
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = RTE_MAX_LCORE;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-',',' and ')' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == ')')) {
+			max = idx;
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			min = RTE_MAX_LCORE;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0' && *end != ')');
+
+	return str - input;
+}
+
+/* convert from set array to cpuset bitmap */
+static inline int
+convert_to_cpuset(rte_cpuset_t *cpusetp,
+	      uint16_t *set, unsigned num)
+{
+	unsigned idx;
+
+	CPU_ZERO(cpusetp);
+
+	for (idx = 0; idx < num; idx++) {
+		if (!set[idx])
+			continue;
+
+		if (!lcore_config[idx].detected) {
+			RTE_LOG(ERR, EAL, "core %u "
+				"unavailable\n", idx);
+			return -1;
+		}
+
+		CPU_SET(idx, cpusetp);
+	}
+
+	return 0;
+}
+
+/*
+ * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
+ * lcores and cpus can each be a single number or a group.
+ * '(' and ')' are required if it's a group.
+ * If '@cpus' is not supplied, cpus takes the same value as lcores.
+ * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' starts 7 EAL threads as follows
+ *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 1 runs on cpuset 0x2 (cpu 1)
+ *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
+ *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
+ *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
+ */
+static int
+eal_parse_lcores(const char *lcores)
+{
+	struct rte_config *cfg = rte_eal_get_configuration();
+	static uint16_t set[RTE_MAX_LCORE];
+	unsigned idx = 0;
+	int i;
+	unsigned count = 0;
+	const char *lcore_start = NULL;
+	const char *end = NULL;
+	int offset;
+	rte_cpuset_t cpuset;
+	int ret = -1;
+
+	if (lcores == NULL)
+		return -1;
+
+	/* Remove all blank characters ahead and after */
+	while (isblank(*lcores))
+		lcores++;
+	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	while ((i > 0) && isblank(lcores[i - 1]))
+		i--;
+
+	CPU_ZERO(&cpuset);
+
+	/* Reset lcore config */
+	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+		cfg->lcore_role[idx] = ROLE_OFF;
+		lcore_config[idx].core_index = -1;
+		CPU_ZERO(&lcore_config[idx].cpuset);
+	}
+
+	/* Get list of cores */
+	do {
+		while (isblank(*lcores))
+			lcores++;
+		if (*lcores == '\0')
+			goto err;
+
+		/* record lcore_set start point */
+		lcore_start = lcores;
+
+		/* go across a complete bracket */
+		if (*lcore_start == '(') {
+			lcores += strcspn(lcores, ")");
+			if (*lcores++ == '\0')
+				goto err;
+		}
+
+		/* scan the separator '@', ','(next) or '\0'(finish) */
+		lcores += strcspn(lcores, "@,");
+
+		if (*lcores == '@') {
+			/* explicitly assigned cpu_set */
+			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
+			if (offset < 0)
+				goto err;
+
+			/* prepare cpu_set and update the end cursor */
+			if (0 > convert_to_cpuset(&cpuset,
+						  set, RTE_DIM(set)))
+				goto err;
+			end = lcores + 1 + offset;
+		} else  /* ',' or '\0' */
+			/* haven't given cpu_set, current loop done */
+			end = lcores;
+
+		if (*end != ',' && *end != '\0')
+			goto err;
+
+		/* parse lcore_set from start point */
+		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
+			goto err;
+
+		/* without '@', by default using lcore_set as cpu_set */
+		if (*lcores != '@' &&
+		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
+			goto err;
+
+		/* start to update lcore_set */
+		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+			if (!set[idx])
+				continue;
+
+			if (cfg->lcore_role[idx] != ROLE_RTE) {
+				lcore_config[idx].core_index = count;
+				cfg->lcore_role[idx] = ROLE_RTE;
+				count++;
+			}
+			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
+				   sizeof(rte_cpuset_t));
+		}
+
+		lcores = end + 1;
+	} while (*end != '\0');
+
+	if (count == 0)
+		goto err;
+
+	cfg->lcore_count = count;
+	lcores_parsed = 1;
+	ret = 0;
+
+err:
+
+	return ret;
+}
+
 static int
 eal_parse_syslog(const char *facility, struct internal_config *conf)
 {
@@ -489,6 +728,13 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->log_level = log;
 		break;
 	}
+	case OPT_LCORES_NUM:
+		if (eal_parse_lcores(optarg) < 0) {
+			RTE_LOG(ERR, EAL, "invalid parameter for --"
+				OPT_LCORES "\n");
+			return -1;
+		}
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -527,7 +773,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
 
 	if (!lcores_parsed) {
 		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
-			"-c or -l\n");
+			"-c, -l or --lcores\n");
 		return -1;
 	}
 	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
@@ -583,6 +829,14 @@ eal_common_usage(void)
 	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
 	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
 	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
+	       "  --"OPT_LCORES" MAP: maps between lcore_set to phys_cpu_set\n"
+	       "                 The argument format is\n"
+	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
+	       "                 lcores and cpus list are grouped by '(' and ')'\n"
+	       "                 Within the group, '-' is used for range separator,\n"
+	       "                 ',' is used for single number separator.\n"
+	       "                 '( )' can be omitted for single element group, '@' \n"
+	       "                 can be omitted if cpus and lcores has the same value\n"
 	       "  -n NUM       : Number of memory channels\n"
 	       "  -v           : Display version information on startup\n"
 	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e476f8d..a1cc59f 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -77,6 +77,8 @@ enum {
 	OPT_CREATE_UIO_DEV_NUM,
 #define OPT_VFIO_INTR    "vfio-intr"
 	OPT_VFIO_INTR_NUM,
+#define OPT_LCORES "lcores"
+	OPT_LCORES_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 0e9c447..025d836 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
+CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4


* [dpdk-dev] [PATCH v1 03/15] eal: add support parsing socket_id from cpuset
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 01/15] eal: add cpuset into per EAL thread lcore_config Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 04/15] eal: new TLS definition and API declaration Cunming Liang
                     ` (13 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC
  To: dev

This patch adds eal_cpuset_socket_id(), which returns the socket_id if all
cpus in the cpuset belong to the same NUMA node; otherwise it returns
SOCKET_ID_ANY.
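
The rule can be sketched as follows, assuming a hypothetical two-socket
topology and a 64-bit mask in place of rte_cpuset_t. SOCKET_ANY, NUM_CPU,
cpu_to_socket() and cpuset_socket() are illustrative names, not the functions
added by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define SOCKET_ANY (-1)   /* stand-in for SOCKET_ID_ANY */
#define NUM_CPU 8u

/* Hypothetical topology: cpus 0-3 on socket 0, cpus 4-7 on socket 1. */
static int
cpu_to_socket(unsigned cpu)
{
	return cpu < 4 ? 0 : 1;
}

/* Return the common socket of all cpus in the mask, or SOCKET_ANY if the
 * mask is empty or spans more than one socket. */
static int
cpuset_socket(uint64_t cpuset)
{
	int socket = SOCKET_ANY;
	unsigned cpu;

	for (cpu = 0; cpu < NUM_CPU; cpu++) {
		if (!(cpuset & (UINT64_C(1) << cpu)))
			continue;
		if (socket == SOCKET_ANY)
			socket = cpu_to_socket(cpu);   /* first cpu found */
		else if (socket != cpu_to_socket(cpu))
			return SOCKET_ANY;             /* cpus on different sockets */
	}
	return socket;
}
```

Under this assumed topology, the cpuset 0xe0 (cpu 5,6,7) resolves to socket 1,
while 0x41 (cpu 0,6) spans both sockets and resolves to SOCKET_ANY.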

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++
 lib/librte_eal/common/eal_thread.h      | 52 +++++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +++++
 3 files changed, 66 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 72f8ac2..162fb4f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -41,6 +41,7 @@
 #include <rte_debug.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 
 /* No topology information available on FreeBSD including NUMA info */
 #define cpu_core_id(X) 0
@@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(__rte_unused unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index b53b84d..a25ee86 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -34,6 +34,10 @@
 #ifndef EAL_THREAD_H
 #define EAL_THREAD_H
 
+#include <sched.h>
+
+#include <rte_debug.h>
+
 /**
  * basic loop of thread, called for each thread by eal_init().
  *
@@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
  */
 void eal_thread_init_master(unsigned lcore_id);
 
+/**
+ * Get the NUMA socket id from cpu id.
+ * This function is private to EAL.
+ *
+ * @param cpu_id
+ *   The logical processor id.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+unsigned eal_cpu_socket_id(unsigned cpu_id);
+
+/**
+ * Get the NUMA socket id from cpuset.
+ * This function is private to EAL.
+ *
+ * @param cpusetp
+ *   Pointer to a valid cpu set.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+static inline int
+eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
+{
+	unsigned cpu = 0;
+	int socket_id = SOCKET_ID_ANY;
+	int sid;
+
+	if (cpusetp == NULL)
+		return SOCKET_ID_ANY;
+
+	do {
+		if (!CPU_ISSET(cpu, cpusetp))
+			continue;
+
+		if (socket_id == SOCKET_ID_ANY)
+			socket_id = eal_cpu_socket_id(cpu);
+
+		sid = eal_cpu_socket_id(cpu);
+		if (socket_id != sid) {
+			socket_id = SOCKET_ID_ANY;
+			break;
+		}
+
+	} while (++cpu < RTE_MAX_LCORE);
+
+	return socket_id;
+}
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index 29615f8..922af6d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -45,6 +45,7 @@
 
 #include "eal_private.h"
 #include "eal_filesystem.h"
+#include "eal_thread.h"
 
 #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
 #define CORE_ID_FILE "topology/core_id"
@@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
-- 
1.8.1.4


* [dpdk-dev] [PATCH v1 04/15] eal: new TLS definition and API declaration
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (2 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 03/15] eal: add support parsing socket_id from cpuset Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 05/15] eal: add eal_common_thread.c for common thread API Cunming Liang
                     ` (12 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC
  To: dev

1. Add two TLS variables, *_socket_id* and *_cpuset*.
2. Add two external APIs, rte_thread_set_affinity() and rte_thread_get_affinity().
3. Add one internal API, eal_thread_dump_affinity().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
 lib/librte_eal/common/eal_thread.h        | 14 ++++++++++++++
 lib/librte_eal/common/include/rte_lcore.h | 29 +++++++++++++++++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
 4 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index ab05368..10220c7 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index a25ee86..28edf51 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
 	return socket_id;
 }
 
+/**
+ * Dump the current pthread cpuset.
+ * This function is private to EAL.
+ *
+ * @param str
+ *   The string buffer the cpuset will dump to.
+ * @param size
+ *   The string buffer size.
+ */
+#define CPU_STR_LEN            256
+void
+eal_thread_dump_affinity(char str[], unsigned size);
+
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 4c7d6bb..facdbdc 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -43,6 +43,7 @@
 #include <rte_per_lcore.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
+#include <rte_memory.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -80,7 +81,9 @@ struct lcore_config {
  */
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];
 
-RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
+RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
+RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
+RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
 
 /**
  * Return the ID of the execution unit we are running on.
@@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
 static inline unsigned
 rte_socket_id(void)
 {
-	return lcore_config[rte_lcore_id()].socket_id;
+	return RTE_PER_LCORE(_socket_id);
 }
 
 /**
@@ -229,6 +232,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap)
 	     i<RTE_MAX_LCORE;						\
 	     i = rte_get_next_lcore(i, 1, 0))
 
+/**
+ * Set core affinity of the current thread.
+ * Supports both EAL and non-EAL threads and updates TLS.
+ *
+ * @param cpusetp
+ *   Pointer to the cpu_set_t used to set the current thread affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_set_affinity(rte_cpuset_t *cpusetp);
+
+/**
+ * Get core affinity of the current thread.
+ *
+ * @param cpusetp
+ *   Pointer to the cpu_set_t that receives the current thread cpu affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_get_affinity(rte_cpuset_t *cpusetp);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 80a985f..748a83a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
-- 
1.8.1.4


* [dpdk-dev] [PATCH v1 05/15] eal: add eal_common_thread.c for common thread API
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (3 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 04/15] eal: new TLS definition and API declaration Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 06/15] eal: add rte_gettid() to acquire unique system tid Cunming Liang
                     ` (11 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC
  To: dev

The API works for both EAL and non-EAL threads.
When rte_thread_set_affinity() is called, the *_socket_id* and
*_cpuset* of the calling thread are updated if the thread
successfully sets its cpu affinity.
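
The update-on-success behaviour can be sketched as follows. This is a
simplified model, not the implementation in this patch: os_set_affinity() is a
hypothetical stand-in for the OS affinity call, socket_of() assumes a made-up
two-socket topology, and a 64-bit mask replaces rte_cpuset_t.

```c
#include <assert.h>
#include <stdint.h>

/* Per-thread cache; _Thread_local gives each thread its own copy. */
static _Thread_local uint64_t tls_cpuset;
static _Thread_local int tls_socket_id = -1;

/* Hypothetical stand-in for the OS affinity call: it rejects an empty set
 * and otherwise reports success (0). */
static int
os_set_affinity(uint64_t cpuset)
{
	return cpuset == 0 ? -1 : 0;
}

/* Hypothetical topology: cpus 0-3 on socket 0, all others on socket 1;
 * -1 means the set spans both sockets. */
static int
socket_of(uint64_t cpuset)
{
	int low = (cpuset & 0xf) != 0;
	int high = (cpuset & ~UINT64_C(0xf)) != 0;

	return (low && high) ? -1 : (high ? 1 : 0);
}

/* Apply the affinity, then refresh the per-thread cache only on success. */
static int
thread_set_affinity(uint64_t cpuset)
{
	if (os_set_affinity(cpuset) != 0)
		return -1;          /* cache keeps its previous values */
	tls_cpuset = cpuset;
	tls_socket_id = socket_of(cpuset);
	return 0;
}
```

Refreshing the cache only after the OS call succeeds keeps the cached cpuset
and socket id consistent with the affinity the thread actually has.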

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile        |   1 +
 lib/librte_eal/common/eal_common_thread.c | 142 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile      |   2 +
 3 files changed, 145 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index d434882..78406be 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -73,6 +73,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
new file mode 100644
index 0000000..d996690
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -0,0 +1,142 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <sched.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memcpy.h>
+
+#include "eal_thread.h"
+
+int
+rte_thread_set_affinity(rte_cpuset_t *cpusetp)
+{
+	int s;
+	unsigned lcore_id;
+	pthread_t tid;
+
+	if (!cpusetp)
+		return -1;
+
+	lcore_id = rte_lcore_id();
+	if (lcore_id != (unsigned)LCORE_ID_ANY) {
+		/* EAL thread */
+		tid = lcore_config[lcore_id].thread_id;
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* update lcore_config */
+		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
+		rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
+			   sizeof(rte_cpuset_t));
+	} else {
+		/* non-EAL thread */
+		tid = pthread_self();
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+	}
+
+	return 0;
+}
+
+int
+rte_thread_get_affinity(rte_cpuset_t *cpusetp)
+{
+	if (!cpusetp)
+		return -1;
+
+	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
+		   sizeof(rte_cpuset_t));
+
+	return 0;
+}
+
+void
+eal_thread_dump_affinity(char str[], unsigned size)
+{
+	rte_cpuset_t cpuset;
+	unsigned cpu;
+	int ret;
+	unsigned int out = 0;
+
+	if (rte_thread_get_affinity(&cpuset) < 0) {
+		str[0] = '\0';
+		return;
+	}
+
+	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
+		if (!CPU_ISSET(cpu, &cpuset))
+			continue;
+
+		ret = snprintf(str + out,
+			       size - out, "%u,", cpu);
+		if (ret < 0 || (unsigned)ret >= size - out)
+			break;
+
+		out += ret;
+	}
+
+	/* remove the last separator */
+	if (out > 0)
+		str[out - 1] = '\0';
+}
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 025d836..07e21ca 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -85,6 +85,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
@@ -96,6 +97,7 @@ CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
+CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 06/15] eal: add rte_gettid() to acquire unique system tid
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (4 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 05/15] eal: add eal_common_thread.c for common thread API Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 07/15] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
                     ` (10 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

rte_gettid() wraps the system call that returns the thread ID: gettid() on Linux and thr_self() on FreeBSD.
It provides a persistent, system-unique thread ID for the calling thread.
The ID is saved in TLS on the first call and returned from there afterwards.
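
The caching pattern can be sketched roughly as follows. This is only an
illustration for Linux: sketch_sys_gettid() and sketch_gettid() are
hypothetical stand-ins for rte_sys_gettid() and rte_gettid(), and the
GCC/Clang __thread keyword is assumed for the per-thread storage.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for rte_sys_gettid() on Linux. */
static int sketch_sys_gettid(void)
{
	return (int)syscall(SYS_gettid);
}

/* Hypothetical stand-in for rte_gettid(): one syscall per thread, then TLS. */
static int sketch_gettid(void)
{
	static __thread int cached_tid = -1;

	if (cached_tid == -1)
		cached_tid = sketch_sys_gettid();
	return cached_tid;
}

/* Usage: repeated calls in one thread return the same cached value. */
static void sketch_gettid_demo(void)
{
	int first = sketch_gettid();

	assert(first > 0);
	assert(sketch_gettid() == first);
}
```

Only the first call in each thread pays for the system call; later calls
read the thread-local copy.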

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   |  9 +++++++++
 lib/librte_eal/common/include/rte_eal.h  | 27 +++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_thread.c |  7 +++++++
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 10220c7..d0c077b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <sched.h>
 #include <pthread_np.h>
 #include <sys/queue.h>
+#include <sys/thr.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* acquire the calling thread's tid via thr_self() */
+int rte_sys_gettid(void)
+{
+	long lwpid;
+	thr_self(&lwpid);
+	return (int)lwpid;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..8ccdd65 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -41,6 +41,9 @@
  */
 
 #include <stdint.h>
+#include <sched.h>
+
+#include <rte_per_lcore.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t usage_func );
  */
 int rte_eal_has_hugepages(void);
 
+/**
+ * A wrapper API for the gettid system call.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   This call always succeeds.
+ */
+int rte_sys_gettid(void);
+
+/**
+ * Get the system-unique thread ID (cached in TLS after the first call).
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   This call always succeeds.
+ */
+static inline int rte_gettid(void)
+{
+	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
+	if (RTE_PER_LCORE(_thread_id) == -1)
+		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
+	return RTE_PER_LCORE(_thread_id);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 748a83a..ed20c93 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <pthread.h>
 #include <sched.h>
 #include <sys/queue.h>
+#include <sys/syscall.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* acquire the calling thread's tid via gettid() */
+int rte_sys_gettid(void)
+{
+	return (int)syscall(SYS_gettid);
+}
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 07/15] eal: apply affinity of EAL thread by assigned cpuset
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (5 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 06/15] eal: add rte_gettid() to acquire unique system tid Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 08/15] enic: fix re-define freebsd compile complain Cunming Liang
                     ` (9 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

EAL threads use their assigned cpuset to set core affinity during startup.
The 1:1 mapping is kept if the '--lcores' option is not used.
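
The "cpuset=[...]" string in the new debug messages is a comma-separated
list of CPU ids. A rough sketch of that formatting is below. To stay
portable it uses a plain 64-bit mask as a hypothetical stand-in for
rte_cpuset_t, and sketch_dump_mask() is an illustrative name, not the
helper added by this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format the set bits of 'mask' as "0,2,3"; always NUL-terminates 'str'. */
static void sketch_dump_mask(uint64_t mask, char *str, size_t size)
{
	size_t out = 0;
	unsigned cpu;

	if (size == 0)
		return;
	str[0] = '\0';

	for (cpu = 0; cpu < 64; cpu++) {
		int ret;

		if (!(mask & (UINT64_C(1) << cpu)))
			continue;

		ret = snprintf(str + out, size - out, "%u,", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			/* no room: drop the partial entry and stop */
			str[out] = '\0';
			break;
		}
		out += (size_t)ret;
	}

	/* remove the last separator */
	if (out > 0)
		str[out - 1] = '\0';
}

/* Usage: CPUs 0, 2 and 3 are set in the mask. */
static void sketch_dump_demo(void)
{
	char buf[32];

	sketch_dump_mask(UINT64_C(0xd), buf, sizeof(buf));
	assert(strcmp(buf, "0,2,3") == 0);
}
```

Clearing str[0] up front means an empty set yields an empty string rather
than whatever the caller's buffer happened to contain.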

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
 lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
 lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
 4 files changed, 54 insertions(+), 96 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..98c5a83 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
 	int i, fctret, ret;
 	pthread_t thread_id;
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
-		rte_config.master_lcore, thread_id);
-
 	eal_check_mem_on_local_socket();
 
 	rte_eal_mcfg_complete();
 
+	eal_thread_init_master(rte_config.master_lcore);
+
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		rte_config.master_lcore, thread_id, cpuset);
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
@@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
 			rte_panic("Cannot create thread\n");
 	}
 
-	eal_thread_init_master(rte_config.master_lcore);
-
 	/*
 	 * Launch a dummy function on all slave lcores, so that master lcore
 	 * knows they are all ready when this function returns.
diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index d0c077b..5b16302 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpuset_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +146,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +158,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%p)\n",
-		lcore_id, thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +168,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		lcore_id, thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f99e158..c95adec 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -702,6 +702,7 @@ rte_eal_init(int argc, char **argv)
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
 	struct shared_driver *solib = NULL;
 	const char *logid;
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -802,8 +803,10 @@ rte_eal_init(int argc, char **argv)
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%x)\n",
-		rte_config.master_lcore, (int)thread_id);
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		rte_config.master_lcore, (int)thread_id, cpuset);
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index ed20c93..6eb1525 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -52,6 +52,7 @@
 #include <rte_eal.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_memcpy.h>
 
 #include "eal_private.h"
 #include "eal_thread.h"
@@ -97,61 +98,34 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
 	return 0;
 }
 
-/* set affinity for current thread */
+/* set affinity for current EAL thread */
 static int
 eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpu_set_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpu_set_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	rte_memcpy(&RTE_PER_LCORE(_cpuset),
+		   &lcore_config[lcore_id].cpuset, sizeof(rte_cpuset_t));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +148,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +160,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%x)\n",
-		lcore_id, (int)thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +170,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		lcore_id, (int)thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 08/15] enic: fix re-define freebsd compile complain
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (6 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 07/15] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
                     ` (8 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

Some macros are already defined by FreeBSD's 'sys/param.h'; undefine them before redefining.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_pmd_enic/enic.h        | 1 +
 lib/librte_pmd_enic/enic_compat.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
index c43417c..189c3b9 100644
--- a/lib/librte_pmd_enic/enic.h
+++ b/lib/librte_pmd_enic/enic.h
@@ -66,6 +66,7 @@
 #define ENIC_CALC_IP_CKSUM      1
 #define ENIC_CALC_TCP_UDP_CKSUM 2
 #define ENIC_MAX_MTU            9000
+#undef PAGE_SIZE
 #define PAGE_SIZE               4096
 #define PAGE_ROUND_UP(x) \
 	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
index b1af838..b84c766 100644
--- a/lib/librte_pmd_enic/enic_compat.h
+++ b/lib/librte_pmd_enic/enic_compat.h
@@ -67,6 +67,7 @@
 #define pr_warn(y, args...) dev_warning(0, y, ##args)
 #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
 
+#undef ALIGN
 #define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
 #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
 #define udelay usleep
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (7 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 08/15] enic: fix re-define freebsd compile complain Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-25 23:04     ` Stephen Hemminger
  2015-01-26 13:48     ` Stephen Hemminger
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 10/15] log: fix the gap to support non-EAL thread Cunming Liang
                     ` (7 subsequent siblings)
  16 siblings, 2 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

Add a check on the value returned by rte_socket_id(): if it is SOCKET_ID_ANY (-1), fall back to socket 0.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_malloc/malloc_heap.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-	return rte_socket_id();
+	unsigned socket_id = rte_socket_id();
+
+	if (socket_id == (unsigned)SOCKET_ID_ANY)
+		return 0;
+
+	return socket_id;
 }
 
 void *
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 10/15] log: fix the gap to support non-EAL thread
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (8 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 11/15] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
                     ` (6 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

For a non-EAL thread, *_lcore_id* is invalid and may be greater than or equal to RTE_MAX_LCORE.
The patch adds a bounds check so that only EAL threads use the per-lcore log level and log type.
Other threads fall back to the global log level and log type.
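
The guard is the same bounds check used throughout the series. A minimal
sketch, with hypothetical names throughout (SKETCH_MAX_LCORE stands in
for RTE_MAX_LCORE, SKETCH_LCORE_ID_ANY for (unsigned)LCORE_ID_ANY, and
the table and global level are made-up stand-ins for the EAL log state):

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SKETCH_MAX_LCORE    128       /* stand-in for RTE_MAX_LCORE */
#define SKETCH_LCORE_ID_ANY UINT_MAX  /* stand-in for (unsigned)LCORE_ID_ANY */

struct sketch_log_msg {
	uint32_t loglevel;
	uint32_t logtype;
};

/* per-lcore "current message" table and a global fallback level */
static struct sketch_log_msg sketch_cur_msg[SKETCH_MAX_LCORE];
static uint32_t sketch_global_level = 8;

/* Return the per-lcore level for EAL threads, the global one otherwise. */
static uint32_t sketch_cur_loglevel(unsigned lcore_id)
{
	if (lcore_id >= SKETCH_MAX_LCORE)
		return sketch_global_level;
	return sketch_cur_msg[lcore_id].loglevel;
}

/* Usage: lcore 3 has its own level, a non-EAL thread gets the global one. */
static void sketch_log_demo(void)
{
	sketch_cur_msg[3].loglevel = 4;
	assert(sketch_cur_loglevel(3) == 4);
	assert(sketch_cur_loglevel(SKETCH_LCORE_ID_ANY) == 8);
}
```

Without the check, a non-EAL thread would index the table far out of
bounds.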

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
 lib/librte_eal/common/include/rte_log.h |  5 +++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
 		rte_logs.type &= (~type);
 }
 
+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+	return rte_logs.type;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_level();
 	return log_cur_msg[lcore_id].loglevel;
 }
 
@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_type();
 	return log_cur_msg[lcore_id].logtype;
 }
 
@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
 
 	/* save loglevel and logtype in a global per-lcore variable */
 	lcore_id = rte_lcore_id();
-	log_cur_msg[lcore_id].loglevel = level;
-	log_cur_msg[lcore_id].logtype = logtype;
+	if (lcore_id < RTE_MAX_LCORE) {
+		log_cur_msg[lcore_id].loglevel = level;
+		log_cur_msg[lcore_id].logtype = logtype;
+	}
 
 	ret = vfprintf(f, format, ap);
 	fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);
 
 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 11/15] eal: set _lcore_id and _socket_id to (-1) by default
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (9 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 10/15] log: fix the gap to support non-EAL thread Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 12/15] eal: fix recursive spinlock in non-EAL thread Cunming Liang
                     ` (5 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

For non-EAL threads, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an array index need to handle this case.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its affinity
with rte_thread_set_affinity().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 5b16302..2b3c9a8 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 6eb1525..ab94e20 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -57,8 +57,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 12/15] eal: fix recursive spinlock in non-EAL thread
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (10 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 11/15] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread Cunming Liang
                     ` (4 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

In a non-EAL thread, lcore_id is always LCORE_ID_ANY.
It cannot be used as a unique id for the recursive spinlock,
so use rte_gettid() instead.
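
To illustrate why a shared id breaks the lock, here is a rough sketch of
an owner-id based recursive lock. It is written with C11 atomics rather
than rte_spinlock, all names are hypothetical, and it assumes that -1 is
used as the "unowned" marker.

```c
#include <assert.h>
#include <stdatomic.h>

struct sketch_rlock {
	atomic_flag sl;       /* the underlying lock */
	volatile int user;    /* owner id, -1 when unowned (assumption) */
	volatile int count;   /* recursion depth */
};

static void sketch_rlock_init(struct sketch_rlock *l)
{
	atomic_flag_clear(&l->sl);
	l->user = -1;
	l->count = 0;
}

/* Returns 1 if the lock is now held by 'id', 0 if someone else holds it. */
static int sketch_rlock_trylock(struct sketch_rlock *l, int id)
{
	if (l->user != id) {
		if (atomic_flag_test_and_set(&l->sl))
			return 0;
		l->user = id;
	}
	l->count++;
	return 1;
}

static void sketch_rlock_unlock(struct sketch_rlock *l)
{
	if (--l->count == 0) {
		l->user = -1;
		atomic_flag_clear(&l->sl);
	}
}

/* Usage: distinct ids exclude each other, the owner may re-enter. */
static void sketch_rlock_demo(void)
{
	struct sketch_rlock l;

	sketch_rlock_init(&l);
	assert(sketch_rlock_trylock(&l, 100) == 1);
	assert(sketch_rlock_trylock(&l, 100) == 1);  /* recursive entry */
	assert(sketch_rlock_trylock(&l, 200) == 0);  /* other owner blocked */
	sketch_rlock_unlock(&l);
	sketch_rlock_unlock(&l);
	assert(sketch_rlock_trylock(&l, 200) == 1);
	sketch_rlock_unlock(&l);
}
```

In this sketch, an id of -1 (what LCORE_ID_ANY becomes when stored in an
int) matches the unowned marker, so every non-EAL caller would skip the
underlying lock entirely. A per-thread unique id avoids that.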

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (11 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 12/15] eal: fix recursive spinlock in non-EAL thread Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  9:52     ` Walukiewicz, Miroslaw
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 14/15] ring: " Cunming Liang
                     ` (3 subsequent siblings)
  16 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

For a non-EAL thread, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used from either an EAL thread or any user pthread.
A non-EAL thread relies directly on rte_ring, which is non-preemptive.
Running multiple pthreads per CPU that compete for the same rte_mempool is therefore not recommended:
performance will be poor, and there is a critical risk if the scheduling policy is RT.
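
The put-path decision can be sketched as a small predicate. The names
are hypothetical (SKETCH_MAX_LCORE stands in for RTE_MAX_LCORE), and the
real code branches with a goto inside __mempool_put_bulk() rather than
calling a helper.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_MAX_LCORE 128  /* stand-in for RTE_MAX_LCORE */

/*
 * Return 1 if a put may go through the per-lcore cache, 0 if it must
 * go straight to the ring.
 */
static int sketch_put_uses_cache(unsigned cache_size, int is_mp,
				 unsigned lcore_id)
{
	if (cache_size == 0 || is_mp == 0 || lcore_id >= SKETCH_MAX_LCORE)
		return 0;
	return 1;
}

/* Usage: an EAL thread may use the cache, a non-EAL thread may not. */
static void sketch_mempool_demo(void)
{
	assert(sketch_put_uses_cache(32, 1, 5) == 1);
	assert(sketch_put_uses_cache(32, 1, UINT_MAX) == 0);
}
```

The per-lcore cache array is indexed by lcore_id, so a thread without a
valid lcore_id has no slot to use and must fall through to the ring.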

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..4845f27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
-		unsigned __lcore_id = rte_lcore_id();		\
-		mp->stats[__lcore_id].name##_objs += n;		\
-		mp->stats[__lcore_id].name##_bulk += 1;		\
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			mp->stats[__lcore_id].name##_objs += n;	\
+			mp->stats[__lcore_id].name##_bulk += 1;	\
+		}                                               \
 	} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 	__MEMPOOL_STAT_ADD(mp, put, n);
 
 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-	/* cache is not enabled or single producer */
-	if (unlikely(cache_size == 0 || is_mp == 0))
+	/* cache is not enabled or single producer or non-EAL thread */
+	if (unlikely(cache_size == 0 || is_mp == 0 ||
+		     lcore_id >= RTE_MAX_LCORE))
 		goto ring_enqueue;
 
 	/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
 	uint32_t cache_size = mp->cache_size;
 
 	/* cache is not enabled or single consumer */
-	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+	if (unlikely(cache_size == 0 || is_mc == 0 ||
+		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
 		goto ring_dequeue;
 
 	cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 14/15] ring: add support to non-EAL thread
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (12 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 15/15] timer: " Cunming Liang
                     ` (2 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

The ring debug statistics are not updated for non-EAL threads.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_ring/rte_ring.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..39bacdd 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {		\
-		unsigned __lcore_id = rte_lcore_id();	\
-		r->stats[__lcore_id].name##_objs += n;	\
-		r->stats[__lcore_id].name##_bulk += 1;	\
+#define __RING_STAT_ADD(r, name, n) do {                        \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			r->stats[__lcore_id].name##_objs += n;  \
+			r->stats[__lcore_id].name##_bulk += 1;  \
+		}                                               \
 	} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4

* [dpdk-dev] [PATCH v1 15/15] timer: add support to non-EAL thread
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (13 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 14/15] ring: " Cunming Liang
@ 2015-01-22  8:16   ` Cunming Liang
  2015-01-22  9:58     ` Walukiewicz, Miroslaw
  2015-01-22 14:14   ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Ananyev, Konstantin
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
  16 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-01-22  8:16 UTC (permalink / raw)
  To: dev

Allow timers to be set up only for EAL (lcore) threads (lcore_id < RTE_MAX_LCORE).
For example, a dynamically created thread can reset or stop a timer for an lcore thread,
but it cannot set up a timer for itself or for another non-lcore thread.
rte_timer_manage() called from a non-lcore thread does nothing and returns straight away.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++---------
 lib/librte_timer/rte_timer.h |  2 +-
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..601c159 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
 
 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do {				\
-		unsigned __lcore_id = rte_lcore_id();		\
-		priv_timer[__lcore_id].stats.name += (n);	\
+#define __TIMER_STAT_ADD(name, n) do {					\
+		unsigned __lcore_id = rte_lcore_id();			\
+		if (__lcore_id < RTE_MAX_LCORE)				\
+			priv_timer[__lcore_id].stats.name += (n);	\
 	} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
 	unsigned lcore_id;
 
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		lcore_id = LCORE_ID_ANY;
 
 	/* wait that the timer is in correct status before update,
 	 * and mark it as being configured */
 	while (success == 0) {
 		prev_status.u32 = tim->status.u32;
 
+		/*
+		 * prevent race condition of non-EAL threads
+		 * to update the timer. When 'owner == LCORE_ID_ANY',
+		 * it means updated by a non-EAL thread.
+		 */
+		if (lcore_id == (unsigned)LCORE_ID_ANY &&
+		    (uint16_t)lcore_id == prev_status.owner)
+			return -1;
+
 		/* timer is running on another core, exit */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    (unsigned)prev_status.owner != lcore_id)
+		    prev_status.owner != (uint16_t)lcore_id)
 			return -1;
 
 		/* timer is being configured on another core */
@@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 
 	/* round robin for tim_lcore */
 	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-					       0, 1);
-		priv_timer[lcore_id].prev_lcore = tim_lcore;
+		if (lcore_id < RTE_MAX_LCORE) {
+			tim_lcore = rte_get_next_lcore(
+				priv_timer[lcore_id].prev_lcore,
+				0, 1);
+			priv_timer[lcore_id].prev_lcore = tim_lcore;
+		} else
+			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
 	}
 
 	/* wait that the timer is in correct status before update,
@@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 		return -1;
 
 	__TIMER_STAT_ADD(reset, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
 		return -1;
 
 	__TIMER_STAT_ADD(stop, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -499,6 +517,10 @@ void rte_timer_manage(void)
 	uint64_t cur_time;
 	int i, ret;
 
+	/* timer manager only runs on EAL thread */
+	if (lcore_id >= RTE_MAX_LCORE)
+		return;
+
 	__TIMER_STAT_ADD(manage, 1);
 	/* optimize for the case where per-cpu list is empty */
 	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..5c5df91 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
 
-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
 
 /**
  * Timer type: Periodic or single (one-shot).
-- 
1.8.1.4


* Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread Cunming Liang
@ 2015-01-22  9:52     ` Walukiewicz, Miroslaw
  2015-01-22 12:20       ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Walukiewicz, Miroslaw @ 2015-01-22  9:52 UTC (permalink / raw)
  To: Liang, Cunming, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Thursday, January 22, 2015 9:17 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> thread
> 
> For non-EAL thread, bypass per lcore cache, directly use ring pool.
> It allows using rte_mempool in either EAL thread or any user pthread.
> As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> It doesn't suggest to run multi-pthread/cpu which compete the
> rte_mempool.
> It will get bad performance and has critical risk if scheduling policy is RT.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index 3314651..4845f27 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -198,10 +198,12 @@ struct rte_mempool {
>   *   Number to add to the object-oriented statistics.
>   */
>  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> -		unsigned __lcore_id = rte_lcore_id();		\
> -		mp->stats[__lcore_id].name##_objs += n;		\
> -		mp->stats[__lcore_id].name##_bulk += 1;		\
> +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> +		unsigned __lcore_id = rte_lcore_id();           \
> +		if (__lcore_id < RTE_MAX_LCORE) {               \
> +			mp->stats[__lcore_id].name##_objs += n;	\
> +			mp->stats[__lcore_id].name##_bulk += 1;	\
> +		}                                               \
>  	} while(0)
>  #else
>  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>  	__MEMPOOL_STAT_ADD(mp, put, n);
> 
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> -	/* cache is not enabled or single producer */
> -	if (unlikely(cache_size == 0 || is_mp == 0))
> +	/* cache is not enabled or single producer or none EAL thread */

I don't understand this limitation.

I see that rte_mempool.h defines a table sized by RTE_MAX_LCORE, like below:
#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
        /** Per-lcore local cache. */
        struct rte_mempool_cache local_cache[RTE_MAX_LCORE];
#endif

But why can't we extend the local cache table to something like RTE_MAX_THREADS, sized so that rte_lcore_id() never exceeds it?

Keeping this condition here is a real performance killer!
In my test application I saw more than 95% of CPU time spent reading the atomic in the MC/MP ring used to access the mempool.

Same comment for the get operation below.
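
To illustrate the cost being discussed, here is a minimal, self-contained
model of the put path. It is not the DPDK code: all names (sketch_pool,
sketch_put, SKETCH_MAX_LCORE, SKETCH_CACHE_SIZE) are hypothetical, and the
real mempool also flushes its cache to the ring, which this sketch leaves
out. It only shows that a caller whose id is not below the lcore limit
always lands on the shared ring:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_LCORE  4u	/* stand-in for RTE_MAX_LCORE (assumption) */
#define SKETCH_CACHE_SIZE 8u	/* capacity of each per-lcore cache */

struct sketch_pool {
	unsigned cache_len[SKETCH_MAX_LCORE];	/* objects held per lcore */
	unsigned ring_len;			/* objects in the shared ring */
	unsigned ring_ops;			/* accesses to the shared ring */
};

/*
 * Put n objects into the pool.
 * Returns 1 if the per-lcore cache absorbed them, 0 if the shared
 * ring had to be used (non-EAL caller, or the cache would overflow).
 */
static int
sketch_put(struct sketch_pool *p, unsigned lcore_id, unsigned n)
{
	assert(p != NULL);

	/* the id check comes first, so cache_len[] is never indexed
	 * out of range for a non-EAL caller */
	if (lcore_id >= SKETCH_MAX_LCORE ||
	    p->cache_len[lcore_id] + n > SKETCH_CACHE_SIZE) {
		p->ring_len += n;
		p->ring_ops++;
		return 0;
	}

	p->cache_len[lcore_id] += n;
	return 1;
}
```

With this model, sketch_put(&p, 1, 4) stays in the cache of lcore 1, while
sketch_put(&p, (unsigned)-1, 4) increments ring_ops on every call. That
per-call ring access is the contention point raised above.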

> +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> +		     lcore_id >= RTE_MAX_LCORE))
>  		goto ring_enqueue;
> 
>  	/* Go straight to ring if put would overflow mem allocated for cache */
> @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
>  	uint32_t cache_size = mp->cache_size;
> 
>  	/* cache is not enabled or single consumer */
> -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
>  		goto ring_dequeue;
> 
>  	cache = &mp->local_cache[lcore_id];
> --
> 1.8.1.4


* Re: [dpdk-dev] [PATCH v1 15/15] timer: add support to non-EAL thread
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 15/15] timer: " Cunming Liang
@ 2015-01-22  9:58     ` Walukiewicz, Miroslaw
  2015-01-22 12:32       ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Walukiewicz, Miroslaw @ 2015-01-22  9:58 UTC (permalink / raw)
  To: Liang, Cunming, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Thursday, January 22, 2015 9:17 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v1 15/15] timer: add support to non-EAL thread
> 
> Allow to setup timers only for EAL (lcore) threads (__lcore_id <
> MAX_LCORE_ID).
> E.g. – dynamically created thread will be able to reset/stop timer for lcore
> thread,
> but it will be not allowed to setup timer for itself or another non-lcore
> thread.
> rte_timer_manage() for non-lcore thread would simply do nothing and
> return straightway.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++---------
>  lib/librte_timer/rte_timer.h |  2 +-
>  2 files changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> index 269a992..601c159 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
> 

Why not extend priv_timer so that it covers the whole range of values returned by rte_lcore_id()?

All the timer code, including stats, would then work automatically, with no change to the timer logic.

>  /* when debug is enabled, store some statistics */
>  #ifdef RTE_LIBRTE_TIMER_DEBUG
> -#define __TIMER_STAT_ADD(name, n) do {				\
> -		unsigned __lcore_id = rte_lcore_id();		\
> -		priv_timer[__lcore_id].stats.name += (n);	\
> +#define __TIMER_STAT_ADD(name, n) do {					\
> +		unsigned __lcore_id = rte_lcore_id();			\
> +		if (__lcore_id < RTE_MAX_LCORE)				\
> +			priv_timer[__lcore_id].stats.name += (n);	\
>  	} while(0)
>  #else
>  #define __TIMER_STAT_ADD(name, n) do {} while(0)
> @@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
>  	unsigned lcore_id;
> 
>  	lcore_id = rte_lcore_id();
> +	if (lcore_id >= RTE_MAX_LCORE)
> +		lcore_id = LCORE_ID_ANY;
> 
>  	/* wait that the timer is in correct status before update,
>  	 * and mark it as being configured */
>  	while (success == 0) {
>  		prev_status.u32 = tim->status.u32;
> 
> +		/*
> +		 * prevent race condition of non-EAL threads
> +		 * to update the timer. When 'owner == LCORE_ID_ANY',
> +		 * it means updated by a non-EAL thread.
> +		 */
> +		if (lcore_id == (unsigned)LCORE_ID_ANY &&
> +		    (uint16_t)lcore_id == prev_status.owner)
> +			return -1;
> +
>  		/* timer is running on another core, exit */
>  		if (prev_status.state == RTE_TIMER_RUNNING &&
> -		    (unsigned)prev_status.owner != lcore_id)
> +		    prev_status.owner != (uint16_t)lcore_id)
>  			return -1;
> 
>  		/* timer is being configured on another core */
> @@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
> 
>  	/* round robin for tim_lcore */
>  	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
> -		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
> -					       0, 1);
> -		priv_timer[lcore_id].prev_lcore = tim_lcore;
> +		if (lcore_id < RTE_MAX_LCORE) {
> +			tim_lcore = rte_get_next_lcore(
> +				priv_timer[lcore_id].prev_lcore,
> +				0, 1);
> +			priv_timer[lcore_id].prev_lcore = tim_lcore;
> +		} else
> +			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
>  	}
> 
>  	/* wait that the timer is in correct status before update,
> @@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
>  		return -1;
> 
>  	__TIMER_STAT_ADD(reset, 1);
> -	if (prev_status.state == RTE_TIMER_RUNNING) {
> +	if (prev_status.state == RTE_TIMER_RUNNING &&
> +	    lcore_id < RTE_MAX_LCORE) {
>  		priv_timer[lcore_id].updated = 1;
>  	}
> 
> @@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
>  		return -1;
> 
>  	__TIMER_STAT_ADD(stop, 1);
> -	if (prev_status.state == RTE_TIMER_RUNNING) {
> +	if (prev_status.state == RTE_TIMER_RUNNING &&
> +	    lcore_id < RTE_MAX_LCORE) {
>  		priv_timer[lcore_id].updated = 1;
>  	}
> 
> @@ -499,6 +517,10 @@ void rte_timer_manage(void)
>  	uint64_t cur_time;
>  	int i, ret;
> 
> +	/* timer manager only runs on EAL thread */
> +	if (lcore_id >= RTE_MAX_LCORE)
> +		return;
> +
>  	__TIMER_STAT_ADD(manage, 1);
>  	/* optimize for the case where per-cpu list is empty */
>  	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
> diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
> index 4907cf5..5c5df91 100644
> --- a/lib/librte_timer/rte_timer.h
> +++ b/lib/librte_timer/rte_timer.h
> @@ -76,7 +76,7 @@ extern "C" {
>  #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
>  #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
> 
> -#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
> +#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
> 
>  /**
>   * Timer type: Periodic or single (one-shot).
> --
> 1.8.1.4



* Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-01-22 12:19     ` Bruce Richardson
  2015-01-22 14:34       ` Ananyev, Konstantin
  2015-01-23  0:39       ` Liang, Cunming
  0 siblings, 2 replies; 253+ messages in thread
From: Bruce Richardson @ 2015-01-22 12:19 UTC (permalink / raw)
  To: Cunming Liang; +Cc: dev

On Thu, Jan 22, 2015 at 04:16:25PM +0800, Cunming Liang wrote:
> It supports one new eal long option '--lcores' for EAL thread cpuset assignment.
> 
> The format pattern:
> 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> lcores, cpus could be a single digit or a group.
> '(' and ')' are necessary if it's a group.
> If not supply '@cpus', the value of cpus uses the same as lcores.
> 
> e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
>   lcore 0 runs on cpuset 0x41 (cpu 0,6)
>   lcore 1 runs on cpuset 0x2 (cpu 1)
>   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
>   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
>   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> 

This strikes me as very confusing, though a couple of tweaks might help with
readability. The lcore 0 at the end is especially confusing. Perhaps we can 
limit the allowed formats here,
* require the lcore_id to be specified - the lack of an lcore id for the last part
makes having it as lcore 0 surprising.
* only allow one lcore id to be given for each set of cores. 

I think it may still be readable if we allow the core set to be omitted when it
is the same as the lcore_id.

It's probably still not going to be very tidy, but I think we can improve things.

/Bruce
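
For reference, the cpuset values in the quoted example can be checked with
a small stand-alone sketch. This is not the parser from the patch:
sketch_cpus_to_mask is a hypothetical helper that accepts a single number
or one bracketed group, supports only CPU ids below 64, does not skip
blanks, and (unlike eal_parse_set() above) rejects reversed ranges.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: turn "1", "(5-7)" or "(0,6)" into a bitmask
 * with one bit per CPU id. Returns 0 on success, -1 on a malformed
 * string or an id of 64 or more.
 */
static int
sketch_cpus_to_mask(const char *s, uint64_t *mask)
{
	int grouped;
	unsigned long lo, hi, i;
	char *end;

	assert(s != NULL && mask != NULL);

	*mask = 0;
	grouped = (*s == '(');
	if (grouped)
		s++;

	for (;;) {
		if (!isdigit((unsigned char)*s))
			return -1;
		lo = hi = strtoul(s, &end, 10);

		/* a range such as 5-7 is only legal inside brackets */
		if (*end == '-') {
			s = end + 1;
			if (!grouped || !isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi >= 64)
			return -1;
		for (i = lo; i <= hi; i++)
			*mask |= UINT64_C(1) << i;

		/* ',' separates elements only inside a bracketed group */
		if (grouped && *end == ',') {
			s = end + 1;
			continue;
		}
		break;
	}

	if (grouped) {
		if (*end != ')')
			return -1;
		end++;
	}
	return *end == '\0' ? 0 : -1;
}
```

Under these assumptions "(5-7)" gives 0xe0, "(0,6)" gives 0x41, "(0,2)"
gives 0x5 and "1" gives 0x2, matching the cpusets listed in the commit
message. A bare "5-7" is rejected, since brackets are required for a group.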

> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/common/eal_common_launch.c  |   1 -
>  lib/librte_eal/common/eal_common_options.c | 262 ++++++++++++++++++++++++++++-
>  lib/librte_eal/common/eal_options.h        |   2 +
>  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
>  4 files changed, 261 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
> index 599f83b..2d732b1 100644
> --- a/lib/librte_eal/common/eal_common_launch.c
> +++ b/lib/librte_eal/common/eal_common_launch.c
> @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
>  		rte_eal_wait_lcore(lcore_id);
>  	}
>  }
> -
> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> index e2810ab..fc47588 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -45,6 +45,7 @@
>  #include <rte_lcore.h>
>  #include <rte_version.h>
>  #include <rte_devargs.h>
> +#include <rte_memcpy.h>
>  
>  #include "eal_internal_cfg.h"
>  #include "eal_options.h"
> @@ -85,6 +86,7 @@ eal_long_options[] = {
>  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
>  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
>  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
>  	{0, 0, 0, 0}
>  };
>  
> @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
>  			if (min == RTE_MAX_LCORE)
>  				min = idx;
>  			for (idx = min; idx <= max; idx++) {
> -				cfg->lcore_role[idx] = ROLE_RTE;
> -				lcore_config[idx].core_index = count;
> -				count++;
> +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> +					cfg->lcore_role[idx] = ROLE_RTE;
> +					lcore_config[idx].core_index = count;
> +					count++;
> +				}
>  			}
>  			min = RTE_MAX_LCORE;
>  		} else
> @@ -289,6 +293,241 @@ eal_parse_master_lcore(const char *arg)
>  	return 0;
>  }
>  
> +/*
> + * Parse elem, the elem could be single number or '(' ')' group
> + * Within group elem, '-' used for a range seperator;
> + *                    ',' used for a single number.
> + */
> +static int
> +eal_parse_set(const char *input, uint16_t set[], unsigned num)
> +{
> +	unsigned idx;
> +	const char *str = input;
> +	char *end = NULL;
> +	unsigned min, max;
> +
> +	memset(set, 0, num * sizeof(uint16_t));
> +
> +	while (isblank(*str))
> +		str++;
> +
> +	/* only digit or left bracket is qulify for start point */
> +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> +		return -1;
> +
> +	/* process single number */
> +	if (*str != '(') {
> +		errno = 0;
> +		idx = strtoul(str, &end, 10);
> +		if (errno || end == NULL || idx >= num)
> +			return -1;
> +		else {
> +			while (isblank(*end))
> +				end++;
> +
> +			if (*end != ',' && *end != '\0' &&
> +			    *end != '@')
> +				return -1;
> +
> +			set[idx] = 1;
> +			return end - input;
> +		}
> +	}
> +
> +	/* process set within bracket */
> +	str++;
> +	while (isblank(*str))
> +		str++;
> +	if (*str == '\0')
> +		return -1;
> +
> +	min = RTE_MAX_LCORE;
> +	do {
> +
> +		/* go ahead to the first digit */
> +		while (isblank(*str))
> +			str++;
> +		if (!isdigit(*str))
> +			return -1;
> +
> +		/* get the digit value */
> +		errno = 0;
> +		idx = strtoul(str, &end, 10);
> +		if (errno || end == NULL || idx >= num)
> +			return -1;
> +
> +		/* go ahead to separator '-',',' and ')' */
> +		while (isblank(*end))
> +			end++;
> +		if (*end == '-') {
> +			if (min == RTE_MAX_LCORE)
> +				min = idx;
> +			else /* avoid continuous '-' */
> +				return -1;
> +		} else if ((*end == ',') || (*end == ')')) {
> +			max = idx;
> +			if (min == RTE_MAX_LCORE)
> +				min = idx;
> +			for (idx = RTE_MIN(min, max);
> +			     idx <= RTE_MAX(min, max); idx++)
> +				set[idx] = 1;
> +
> +			min = RTE_MAX_LCORE;
> +		} else
> +			return -1;
> +
> +		str = end + 1;
> +	} while (*end != '\0' && *end != ')');
> +
> +	return str - input;
> +}
> +
> +/* convert from set array to cpuset bitmap */
> +static inline int
> +convert_to_cpuset(rte_cpuset_t *cpusetp,
> +	      uint16_t *set, unsigned num)
> +{
> +	unsigned idx;
> +
> +	CPU_ZERO(cpusetp);
> +
> +	for (idx = 0; idx < num; idx++) {
> +		if (!set[idx])
> +			continue;
> +
> +		if (!lcore_config[idx].detected) {
> +			RTE_LOG(ERR, EAL, "core %u "
> +				"unavailable\n", idx);
> +			return -1;
> +		}
> +
> +		CPU_SET(idx, cpusetp);
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
> + * lcores, cpus could be a single digit or a group.
> + * '(' and ')' are necessary if it's a group.
> + * If not supply '@cpus', the value of cpus uses the same as lcores.
> + * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means start 7 EAL thread as below
> + *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> + *   lcore 1 runs on cpuset 0x2 (cpu 1)
> + *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> + *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> + *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> + */
> +static int
> +eal_parse_lcores(const char *lcores)
> +{
> +	struct rte_config *cfg = rte_eal_get_configuration();
> +	static uint16_t set[RTE_MAX_LCORE];
> +	unsigned idx = 0;
> +	int i;
> +	unsigned count = 0;
> +	const char *lcore_start = NULL;
> +	const char *end = NULL;
> +	int offset;
> +	rte_cpuset_t cpuset;
> +	int ret = -1;
> +
> +	if (lcores == NULL)
> +		return -1;
> +
> +	/* Remove all blank characters ahead and after */
> +	while (isblank(*lcores))
> +		lcores++;
> +	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> +	while ((i > 0) && isblank(lcores[i - 1]))
> +		i--;
> +
> +	CPU_ZERO(&cpuset);
> +
> +	/* Reset lcore config */
> +	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> +		cfg->lcore_role[idx] = ROLE_OFF;
> +		lcore_config[idx].core_index = -1;
> +		CPU_ZERO(&lcore_config[idx].cpuset);
> +	}
> +
> +	/* Get list of cores */
> +	do {
> +		while (isblank(*lcores))
> +			lcores++;
> +		if (*lcores == '\0')
> +			goto err;
> +
> +		/* record lcore_set start point */
> +		lcore_start = lcores;
> +
> +		/* go across a complete bracket */
> +		if (*lcore_start == '(') {
> +			lcores += strcspn(lcores, ")");
> +			if (*lcores++ == '\0')
> +				goto err;
> +		}
> +
> +		/* scan the separator '@', ','(next) or '\0'(finish) */
> +		lcores += strcspn(lcores, "@,");
> +
> +		if (*lcores == '@') {
> +			/* explict assign cpu_set */
> +			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
> +			if (offset < 0)
> +				goto err;
> +
> +			/* prepare cpu_set and update the end cursor */
> +			if (0 > convert_to_cpuset(&cpuset,
> +						  set, RTE_DIM(set)))
> +				goto err;
> +			end = lcores + 1 + offset;
> +		} else  /* ',' or '\0' */
> +			/* haven't given cpu_set, current loop done */
> +			end = lcores;
> +
> +		if (*end != ',' && *end != '\0')
> +			goto err;
> +
> +		/* parse lcore_set from start point */
> +		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
> +			goto err;
> +
> +		/* without '@', by default using lcore_set as cpu_set */
> +		if (*lcores != '@' &&
> +		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
> +			goto err;
> +
> +		/* start to update lcore_set */
> +		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> +			if (!set[idx])
> +				continue;
> +
> +			if (cfg->lcore_role[idx] != ROLE_RTE) {
> +				lcore_config[idx].core_index = count;
> +				cfg->lcore_role[idx] = ROLE_RTE;
> +				count++;
> +			}
> +			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
> +				   sizeof(rte_cpuset_t));
> +		}
> +
> +		lcores = end + 1;
> +	} while (*end != '\0');
> +
> +	if (count == 0)
> +		goto err;
> +
> +	cfg->lcore_count = count;
> +	lcores_parsed = 1;
> +	ret = 0;
> +
> +err:
> +
> +	return ret;
> +}
> +
>  static int
>  eal_parse_syslog(const char *facility, struct internal_config *conf)
>  {
> @@ -489,6 +728,13 @@ eal_parse_common_option(int opt, const char *optarg,
>  		conf->log_level = log;
>  		break;
>  	}
> +	case OPT_LCORES_NUM:
> +		if (eal_parse_lcores(optarg) < 0) {
> +			RTE_LOG(ERR, EAL, "invalid parameter for --"
> +				OPT_LCORES "\n");
> +			return -1;
> +		}
> +		break;
>  
>  	/* don't know what to do, leave this to caller */
>  	default:
> @@ -527,7 +773,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
>  
>  	if (!lcores_parsed) {
>  		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> -			"-c or -l\n");
> +			"-c, -l or --lcores\n");
>  		return -1;
>  	}
>  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> @@ -583,6 +829,14 @@ eal_common_usage(void)
>  	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
>  	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
>  	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
> +	       "  --"OPT_LCORES" MAP: maps between lcore_set to phys_cpu_set\n"
> +	       "                 The argument format is\n"
> +	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
> +	       "                 lcores and cpus list are grouped by '(' and ')'\n"
> +	       "                 Within the group, '-' is used for range separator,\n"
> +	       "                 ',' is used for single number separator.\n"
> +	       "                 '( )' can be omitted for single element group, '@' \n"
> +	       "                 can be omitted if cpus and lcores has the same value\n"
>  	       "  -n NUM       : Number of memory channels\n"
>  	       "  -v           : Display version information on startup\n"
>  	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
> diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> index e476f8d..a1cc59f 100644
> --- a/lib/librte_eal/common/eal_options.h
> +++ b/lib/librte_eal/common/eal_options.h
> @@ -77,6 +77,8 @@ enum {
>  	OPT_CREATE_UIO_DEV_NUM,
>  #define OPT_VFIO_INTR    "vfio-intr"
>  	OPT_VFIO_INTR_NUM,
> +#define OPT_LCORES "lcores"
> +	OPT_LCORES_NUM,
>  	OPT_LONG_MAX_NUM
>  };
>  
> diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> index 0e9c447..025d836 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci.o := -D_GNU_SOURCE
>  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
>  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
> +CFLAGS_eal_common_options.o := -D_GNU_SOURCE
>  
>  # workaround for a gcc bug with noreturn attribute
>  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> -- 
> 1.8.1.4
> 


* Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
  2015-01-22  9:52     ` Walukiewicz, Miroslaw
@ 2015-01-22 12:20       ` Liang, Cunming
  2015-01-22 12:45         ` Walukiewicz, Miroslaw
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-01-22 12:20 UTC (permalink / raw)
  To: Walukiewicz, Miroslaw, dev



> -----Original Message-----
> From: Walukiewicz, Miroslaw
> Sent: Thursday, January 22, 2015 5:53 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> thread
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> > Sent: Thursday, January 22, 2015 9:17 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > thread
> >
> > For non-EAL thread, bypass per lcore cache, directly use ring pool.
> > It allows using rte_mempool in either EAL thread or any user pthread.
> > As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> > It doesn't suggest to run multi-pthread/cpu which compete the
> > rte_mempool.
> > It will get bad performance and has critical risk if scheduling policy is RT.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > index 3314651..4845f27 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -198,10 +198,12 @@ struct rte_mempool {
> >   *   Number to add to the object-oriented statistics.
> >   */
> >  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> > -		unsigned __lcore_id = rte_lcore_id();		\
> > -		mp->stats[__lcore_id].name##_objs += n;		\
> > -		mp->stats[__lcore_id].name##_bulk += 1;		\
> > +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> > +		unsigned __lcore_id = rte_lcore_id();           \
> > +		if (__lcore_id < RTE_MAX_LCORE) {               \
> > +			mp->stats[__lcore_id].name##_objs += n;	\
> > +			mp->stats[__lcore_id].name##_bulk += 1;	\
> > +		}                                               \
> >  	} while(0)
> >  #else
> >  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> > @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
> >  	__MEMPOOL_STAT_ADD(mp, put, n);
> >
> >  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > -	/* cache is not enabled or single producer */
> > -	if (unlikely(cache_size == 0 || is_mp == 0))
> > +	/* cache is not enabled or single producer or none EAL thread */
> 
> I don't understand this limitation.
> 
> I see that the rte_membuf.h defines table per RTE_MAX_LCORE like below
> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>         /** Per-lcore local cache. */
>         struct rte_mempool_cache local_cache[RTE_MAX_LCORE];
> #endif
> 
> But why we cannot extent the size of the local cache table to something like
> RTE_MAX_THREADS that does not exceed max value of rte_lcore_id()
> 
> Keeping this condition here is a  real performance killer!!!!!!.
> I saw in my test application spending more 95% of CPU time reading the atomic
> in M C/MP ring utilizing access to mempool.
[Liang, Cunming] This is the first step to make it work.
Following Konstantin's comments, we should avoid allocating unique ids ourselves,
and the return value of gettid() is too large to use as an index.
For the non-EAL thread performance gap, I will think about an additional fix patch.
If performance matters, an EAL thread is still the better choice for now.
> 
> Same comment for get operation below
> 
> > +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> > +		     lcore_id >= RTE_MAX_LCORE))
> >  		goto ring_enqueue;
> >
> >  	/* Go straight to ring if put would overflow mem allocated for cache */
> > @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
> >  	uint32_t cache_size = mp->cache_size;
> >
> >  	/* cache is not enabled or single consumer */
> > -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> > +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> > +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> >  		goto ring_dequeue;
> >
> >  	cache = &mp->local_cache[lcore_id];
> > --
> > 1.8.1.4


* Re: [dpdk-dev] [PATCH v1 15/15] timer: add support to non-EAL thread
  2015-01-22  9:58     ` Walukiewicz, Miroslaw
@ 2015-01-22 12:32       ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-01-22 12:32 UTC (permalink / raw)
  To: Walukiewicz, Miroslaw, dev



> -----Original Message-----
> From: Walukiewicz, Miroslaw
> Sent: Thursday, January 22, 2015 5:58 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1 15/15] timer: add support to non-EAL thread
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> > Sent: Thursday, January 22, 2015 9:17 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v1 15/15] timer: add support to non-EAL thread
> >
> > Allow to setup timers only for EAL (lcore) threads (__lcore_id <
> > MAX_LCORE_ID).
> > E.g. – dynamically created thread will be able to reset/stop timer for lcore
> > thread,
> > but it will be not allowed to setup timer for itself or another non-lcore
> > thread.
> > rte_timer_manage() for non-lcore thread would simply do nothing and
> > return straightway.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++----
> > -----
> >  lib/librte_timer/rte_timer.h |  2 +-
> >  2 files changed, 32 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> > index 269a992..601c159 100644
> > --- a/lib/librte_timer/rte_timer.c
> > +++ b/lib/librte_timer/rte_timer.c
> > @@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
> >
> 
> Why not extend the priv_timer size to value being in range returned by
> rte_lcore_id().
> 
> All timer stuff will work automatically after such change without any change in
> timer logic including stats.
[Liang, Cunming] For the same reason as in mempool.
As a first step, we don't want to involve dynamic unique id allocation for user threads.
A secondary process that fails won't release its reserved id, which can cause an unexpected leak.
So we will look for another approach to improve the libraries in the next step.
> 
> >  /* when debug is enabled, store some statistics */
> >  #ifdef RTE_LIBRTE_TIMER_DEBUG
> > -#define __TIMER_STAT_ADD(name, n) do {				\
> > -		unsigned __lcore_id = rte_lcore_id();		\
> > -		priv_timer[__lcore_id].stats.name += (n);	\
> > +#define __TIMER_STAT_ADD(name, n) do {
> > 	\
> > +		unsigned __lcore_id = rte_lcore_id();			\
> > +		if (__lcore_id < RTE_MAX_LCORE)
> > 	\
> > +			priv_timer[__lcore_id].stats.name += (n);	\
> >  	} while(0)
> >  #else
> >  #define __TIMER_STAT_ADD(name, n) do {} while(0)
> > @@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
> >  	unsigned lcore_id;
> >
> >  	lcore_id = rte_lcore_id();
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		lcore_id = LCORE_ID_ANY;
> >
> >  	/* wait that the timer is in correct status before update,
> >  	 * and mark it as being configured */
> >  	while (success == 0) {
> >  		prev_status.u32 = tim->status.u32;
> >
> > +		/*
> > +		 * prevent race condition of non-EAL threads
> > +		 * to update the timer. When 'owner == LCORE_ID_ANY',
> > +		 * it means updated by a non-EAL thread.
> > +		 */
> > +		if (lcore_id == (unsigned)LCORE_ID_ANY &&
> > +		    (uint16_t)lcore_id == prev_status.owner)
> > +			return -1;
> > +
> >  		/* timer is running on another core, exit */
> >  		if (prev_status.state == RTE_TIMER_RUNNING &&
> > -		    (unsigned)prev_status.owner != lcore_id)
> > +		    prev_status.owner != (uint16_t)lcore_id)
> >  			return -1;
> >
> >  		/* timer is being configured on another core */
> > @@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t
> > expire,
> >
> >  	/* round robin for tim_lcore */
> >  	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
> > -		tim_lcore =
> > rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
> > -					       0, 1);
> > -		priv_timer[lcore_id].prev_lcore = tim_lcore;
> > +		if (lcore_id < RTE_MAX_LCORE) {
> > +			tim_lcore = rte_get_next_lcore(
> > +				priv_timer[lcore_id].prev_lcore,
> > +				0, 1);
> > +			priv_timer[lcore_id].prev_lcore = tim_lcore;
> > +		} else
> > +			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0,
> > 1);
> >  	}
> >
> >  	/* wait that the timer is in correct status before update,
> > @@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t
> > expire,
> >  		return -1;
> >
> >  	__TIMER_STAT_ADD(reset, 1);
> > -	if (prev_status.state == RTE_TIMER_RUNNING) {
> > +	if (prev_status.state == RTE_TIMER_RUNNING &&
> > +	    lcore_id < RTE_MAX_LCORE) {
> >  		priv_timer[lcore_id].updated = 1;
> >  	}
> >
> > @@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
> >  		return -1;
> >
> >  	__TIMER_STAT_ADD(stop, 1);
> > -	if (prev_status.state == RTE_TIMER_RUNNING) {
> > +	if (prev_status.state == RTE_TIMER_RUNNING &&
> > +	    lcore_id < RTE_MAX_LCORE) {
> >  		priv_timer[lcore_id].updated = 1;
> >  	}
> >
> > @@ -499,6 +517,10 @@ void rte_timer_manage(void)
> >  	uint64_t cur_time;
> >  	int i, ret;
> >
> > +	/* timer manager only runs on EAL thread */
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		return;
> > +
> >  	__TIMER_STAT_ADD(manage, 1);
> >  	/* optimize for the case where per-cpu list is empty */
> >  	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
> > diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
> > index 4907cf5..5c5df91 100644
> > --- a/lib/librte_timer/rte_timer.h
> > +++ b/lib/librte_timer/rte_timer.h
> > @@ -76,7 +76,7 @@ extern "C" {
> >  #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
> >  #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
> >
> > -#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
> > +#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
> >
> >  /**
> >   * Timer type: Periodic or single (one-shot).
> > --
> > 1.8.1.4



* Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
  2015-01-22 12:20       ` Liang, Cunming
@ 2015-01-22 12:45         ` Walukiewicz, Miroslaw
  2015-01-22 14:04           ` Ananyev, Konstantin
  0 siblings, 1 reply; 253+ messages in thread
From: Walukiewicz, Miroslaw @ 2015-01-22 12:45 UTC (permalink / raw)
  To: Liang, Cunming, dev



> -----Original Message-----
> From: Liang, Cunming
> Sent: Thursday, January 22, 2015 1:20 PM
> To: Walukiewicz, Miroslaw; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> thread
> 
> 
> 
> > -----Original Message-----
> > From: Walukiewicz, Miroslaw
> > Sent: Thursday, January 22, 2015 5:53 PM
> > To: Liang, Cunming; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-
> EAL
> > thread
> >
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> > > Sent: Thursday, January 22, 2015 9:17 AM
> > > To: dev@dpdk.org
> > > Subject: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > > thread
> > >
> > > For non-EAL thread, bypass per lcore cache, directly use ring pool.
> > > It allows using rte_mempool in either EAL thread or any user pthread.
> > > As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> > > It doesn't suggest to run multi-pthread/cpu which compete the
> > > rte_mempool.
> > > It will get bad performance and has critical risk if scheduling policy is RT.
> > >
> > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > ---
> > >  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
> > >  1 file changed, 11 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.h
> > > b/lib/librte_mempool/rte_mempool.h
> > > index 3314651..4845f27 100644
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -198,10 +198,12 @@ struct rte_mempool {
> > >   *   Number to add to the object-oriented statistics.
> > >   */
> > >  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > > -#define __MEMPOOL_STAT_ADD(mp, name, n) do {
> 	\
> > > -		unsigned __lcore_id = rte_lcore_id();		\
> > > -		mp->stats[__lcore_id].name##_objs += n;		\
> > > -		mp->stats[__lcore_id].name##_bulk += 1;		\
> > > +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> > > +		unsigned __lcore_id = rte_lcore_id();           \
> > > +		if (__lcore_id < RTE_MAX_LCORE) {               \
> > > +			mp->stats[__lcore_id].name##_objs += n;	\
> > > +			mp->stats[__lcore_id].name##_bulk += 1;	\
> > > +		}                                               \
> > >  	} while(0)
> > >  #else
> > >  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> > > @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp,
> > > void * const *obj_table,
> > >  	__MEMPOOL_STAT_ADD(mp, put, n);
> > >
> > >  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > > -	/* cache is not enabled or single producer */
> > > -	if (unlikely(cache_size == 0 || is_mp == 0))
> > > +	/* cache is not enabled or single producer or none EAL thread */
> >
> > I don't understand this limitation.
> >
> > I see that the rte_membuf.h defines table per RTE_MAX_LCORE like below
> > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> >         /** Per-lcore local cache. */
> >         struct rte_mempool_cache local_cache[RTE_MAX_LCORE];
> > #endif
> >
> > But why we cannot extent the size of the local cache table to something
> like
> > RTE_MAX_THREADS that does not exceed max value of rte_lcore_id()
> >
> > Keeping this condition here is a  real performance killer!!!!!!.
> > I saw in my test application spending more 95% of CPU time reading the
> atomic
> > in M C/MP ring utilizing access to mempool.
> [Liang, Cunming] This is the first step to make it work.
> By Konstantin's comments, shall prevent to allocate unique id by ourselves.
> And the return value from gettid() is too large as an index.
> For non-EAL thread performance gap, will think about additional fix patch
> here.
> If care about performance, still prefer to choose EAL thread now.

In the previous patch the thread id was allocated on the basis of the unique gettid() value, treated as a number
and not as a potential pointer, which we can expect from the gettid() implementation on Linux or BSD.

Another problem is that here we compare an int with some unique thread identifier.
How can you guard against the gettid() implementation changing so that a unique thread identifier
is less than RTE_MAX_LCORE and still unique?

I think that your assumption will work for well-known operating systems but will not be portable.

Regarding performance, DPDK can work efficiently in different environments, including pthreads.
You can imagine running DPDK from a pthread application where the affinity is set by the application.
The efficiency then depends on the application's thread implementation and can be comparable to EAL threads.

I think that this is a goal for this change.

> >
> > Same comment for get operation below
> >
> > > +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> > > +		     lcore_id >= RTE_MAX_LCORE))
> > >  		goto ring_enqueue;
> > >
> > >  	/* Go straight to ring if put would overflow mem allocated for cache
> > > */
> > > @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp,
> void
> > > **obj_table,
> > >  	uint32_t cache_size = mp->cache_size;
> > >
> > >  	/* cache is not enabled or single consumer */
> > > -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> > > +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> > > +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> > >  		goto ring_dequeue;
> > >
> > >  	cache = &mp->local_cache[lcore_id];
> > > --
> > > 1.8.1.4


* Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
  2015-01-22 12:45         ` Walukiewicz, Miroslaw
@ 2015-01-22 14:04           ` Ananyev, Konstantin
  0 siblings, 0 replies; 253+ messages in thread
From: Ananyev, Konstantin @ 2015-01-22 14:04 UTC (permalink / raw)
  To: Walukiewicz, Miroslaw, Liang, Cunming, dev


Hi Miroslaw,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Walukiewicz, Miroslaw
> Sent: Thursday, January 22, 2015 12:45 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
> 
> 
> 
> > -----Original Message-----
> > From: Liang, Cunming
> > Sent: Thursday, January 22, 2015 1:20 PM
> > To: Walukiewicz, Miroslaw; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > thread
> >
> >
> >
> > > -----Original Message-----
> > > From: Walukiewicz, Miroslaw
> > > Sent: Thursday, January 22, 2015 5:53 PM
> > > To: Liang, Cunming; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-
> > EAL
> > > thread
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> > > > Sent: Thursday, January 22, 2015 9:17 AM
> > > > To: dev@dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > > > thread
> > > >
> > > > For non-EAL thread, bypass per lcore cache, directly use ring pool.
> > > > It allows using rte_mempool in either EAL thread or any user pthread.
> > > > As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> > > > It doesn't suggest to run multi-pthread/cpu which compete the
> > > > rte_mempool.
> > > > It will get bad performance and has critical risk if scheduling policy is RT.
> > > >
> > > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > > ---
> > > >  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
> > > >  1 file changed, 11 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/lib/librte_mempool/rte_mempool.h
> > > > b/lib/librte_mempool/rte_mempool.h
> > > > index 3314651..4845f27 100644
> > > > --- a/lib/librte_mempool/rte_mempool.h
> > > > +++ b/lib/librte_mempool/rte_mempool.h
> > > > @@ -198,10 +198,12 @@ struct rte_mempool {
> > > >   *   Number to add to the object-oriented statistics.
> > > >   */
> > > >  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > > > -#define __MEMPOOL_STAT_ADD(mp, name, n) do {
> > 	\
> > > > -		unsigned __lcore_id = rte_lcore_id();		\
> > > > -		mp->stats[__lcore_id].name##_objs += n;		\
> > > > -		mp->stats[__lcore_id].name##_bulk += 1;		\
> > > > +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> > > > +		unsigned __lcore_id = rte_lcore_id();           \
> > > > +		if (__lcore_id < RTE_MAX_LCORE) {               \
> > > > +			mp->stats[__lcore_id].name##_objs += n;	\
> > > > +			mp->stats[__lcore_id].name##_bulk += 1;	\
> > > > +		}                                               \
> > > >  	} while(0)
> > > >  #else
> > > >  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> > > > @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp,
> > > > void * const *obj_table,
> > > >  	__MEMPOOL_STAT_ADD(mp, put, n);
> > > >
> > > >  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > > > -	/* cache is not enabled or single producer */
> > > > -	if (unlikely(cache_size == 0 || is_mp == 0))
> > > > +	/* cache is not enabled or single producer or none EAL thread */
> > >
> > > I don't understand this limitation.
> > >
> > > I see that the rte_membuf.h defines table per RTE_MAX_LCORE like below
> > > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > >         /** Per-lcore local cache. */
> > >         struct rte_mempool_cache local_cache[RTE_MAX_LCORE];
> > > #endif
> > >
> > > But why we cannot extent the size of the local cache table to something
> > like
> > > RTE_MAX_THREADS that does not exceed max value of rte_lcore_id()
> > >
> > > Keeping this condition here is a  real performance killer!!!!!!.
> > > I saw in my test application spending more 95% of CPU time reading the
> > atomic
> > > in M C/MP ring utilizing access to mempool.
> > [Liang, Cunming] This is the first step to make it work.
> > By Konstantin's comments, shall prevent to allocate unique id by ourselves.
> > And the return value from gettid() is too large as an index.
> > For non-EAL thread performance gap, will think about additional fix patch
> > here.
> > If care about performance, still prefer to choose EAL thread now.
> 
> In previous patch you had allocation of the thread id on base of unique gettid() as number
> not a potential pointer as we can expect from implementation getid() from Linux or BSD.

I am really puzzled by your sentence above.
What 'potential pointer' are you talking about?
rte_lcore_id() returns an unsigned 32-bit integer (as it always did).
_lcore_id for each EAL thread is assigned at rte_eal_init().
For an EAL thread, the _lcore_id value is in the interval [0, RTE_MAX_LCORE), and
it is up to the user to make sure that each _lcore_id is unique inside the DPDK multi-process group.
That is all as it was before.
What's new with Steve's patch:
1) At startup the user can select a set of physical cpu(s) on which each EAL thread (lcore) can run.
From the explanation in [PATCH v1 02/15]:
--lcores='1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
lcore 0 runs on cpuset 0x41 (cpu 0,6)
lcore 1 runs on cpuset 0x2 (cpu 1)
lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
lcore 6 runs on cpuset 0x41 (cpu 0,6)

2) Non-EAL threads (dynamically created) will have their _lcore_id set to LCORE_ID_ANY (== UINT32_MAX).
That allows us to distinguish them from EAL threads (lcores).
DPDK functions are changed so that they work safely within such threads too, though there could be some performance
degradation (rte_mempool get/put) compared to EAL threads.
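
To make the check concrete, here is a minimal sketch of the guard pattern the library fixes rely on. It is not the DPDK code: MAX_LCORE, LCORE_ANY, struct pool_stats and record_put() are hypothetical stand-ins for RTE_MAX_LCORE, LCORE_ID_ANY and the per-lcore arrays in the libraries.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DPDK definitions discussed above. */
#define MAX_LCORE 128u        /* plays the role of RTE_MAX_LCORE */
#define LCORE_ANY UINT32_MAX  /* plays the role of LCORE_ID_ANY  */

struct pool_stats {
	uint64_t cached_puts[MAX_LCORE]; /* one slot per EAL thread */
	uint64_t shared_puts;            /* every other caller */
};

/*
 * Record one "put". EAL threads (id < MAX_LCORE) index their own slot;
 * any other caller, including LCORE_ANY, takes the shared path.
 * Returns 1 when the per-lcore slot was used, 0 otherwise.
 */
static int
record_put(struct pool_stats *st, uint32_t lcore_id)
{
	assert(st != NULL);

	if (lcore_id < MAX_LCORE) {
		st->cached_puts[lcore_id]++;
		return 1;
	}
	/* Plain increment for brevity; a real shared path needs atomics or a ring. */
	st->shared_puts++;
	return 0;
}
```

A caller would pass whatever its thread-local lcore id is; the single comparison covers both out-of-range ids and the LCORE_ANY marker, so the per-lcore array is never indexed out of bounds.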

> 
> The another problem is that we compare here int with some unique thread identifier.

We are not doing that here.
_lcore_id is unsigned int.

> How can you prevent that when implementation of gettid will change and unique thread identifier will be
> Less than RTE_MAX_LCORE and will be still unique.

We are not using gettid() to assign _lcore_id values.
And never did.
Please read the patch set carefully.

> 
> I think that your assumption will work for well-known operating systems but will be very unportable.
> 
> Regarding performance the DPDK can work efficiently in different environments including pthreads.
> You can imagine running DPDK from pthread application where affinity will be made by application.
> Effectiveness depends on application thread implementation comparable to EAL threads.

That was one of the goals of the patch: make all DPDK APIs work with dynamically created threads.
So now it is safe to do:

static void *thread_func1(void *arg)
{
    rte_pktmbuf_free((struct rte_mbuf *)arg);
    return NULL;
}

pthread_t tid;
struct rte_mbuf *m = rte_pktmbuf_alloc(mp);

pthread_create(&tid, NULL, thread_func1, m);


As Steve pointed out, if you depend on rte_mempool get/put performance inside your threads,
it is better to use EAL threads.
With the current patch you can have up to RTE_MAX_LCORE such threads and can assign any cpu affinity to each of them.
Though, if you have an idea of how to extend the mempool cache to any number of dynamically created threads
(make the cache for each mempool a sort of TLS?), then sure, we can discuss it.
But I suppose it should be a matter for a separate patch.
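
Just to illustrate what a TLS-style cache could look like, here is a rough sketch under my own assumptions. It is not rte_mempool code: POOL_SIZE, CACHE_SIZE, obj_put() and obj_get() are hypothetical names, the shared pool is a plain stack behind a C11 atomic_flag spinlock, and the per-thread cache uses C11 _Thread_local.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define POOL_SIZE  64 /* hypothetical shared pool capacity */
#define CACHE_SIZE 8  /* hypothetical per-thread cache capacity */

/* Shared pool: a plain stack guarded by a spinlock. */
static void *pool_objs[POOL_SIZE];
static size_t pool_len;
static atomic_flag pool_lock = ATOMIC_FLAG_INIT;

/* Per-thread cache: each thread that touches it gets its own copy. */
static _Thread_local void *cache_objs[CACHE_SIZE];
static _Thread_local size_t cache_len;

static void pool_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&pool_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void pool_release(void)
{
	atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/* Return an object: thread-local cache first, shared pool on overflow. */
static int obj_put(void *obj)
{
	int ret = -1;

	assert(obj != NULL);
	if (cache_len < CACHE_SIZE) {
		cache_objs[cache_len++] = obj;
		return 0;
	}
	pool_acquire();
	if (pool_len < POOL_SIZE) {
		pool_objs[pool_len++] = obj;
		ret = 0;
	}
	pool_release();
	return ret;
}

/* Take an object: thread-local cache first, shared pool when it is empty. */
static void *obj_get(void)
{
	void *obj = NULL;

	if (cache_len > 0)
		return cache_objs[--cache_len];
	pool_acquire();
	if (pool_len > 0)
		obj = pool_objs[--pool_len];
	pool_release();
	return obj;
}
```

The open question with any scheme like this is cleanup: objects left in the cache of a thread that exits stay stranded unless something flushes them back to the shared pool, which is the same kind of leak concern raised for dynamic id allocation.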

Konstantin

> 
> I think that this is a goal for this change.
> 
> > >
> > > Same comment for get operation below
> > >
> > > > +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> > > > +		     lcore_id >= RTE_MAX_LCORE))
> > > >  		goto ring_enqueue;
> > > >
> > > >  	/* Go straight to ring if put would overflow mem allocated for cache
> > > > */
> > > > @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp,
> > void
> > > > **obj_table,
> > > >  	uint32_t cache_size = mp->cache_size;
> > > >
> > > >  	/* cache is not enabled or single consumer */
> > > > -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> > > > +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> > > > +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> > > >  		goto ring_dequeue;
> > > >
> > > >  	cache = &mp->local_cache[lcore_id];
> > > > --
> > > > 1.8.1.4


* Re: [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (14 preceding siblings ...)
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 15/15] timer: " Cunming Liang
@ 2015-01-22 14:14   ` Ananyev, Konstantin
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
  16 siblings, 0 replies; 253+ messages in thread
From: Ananyev, Konstantin @ 2015-01-22 14:14 UTC (permalink / raw)
  To: Liang, Cunming, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Thursday, January 22, 2015 8:16 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core
> 
> The patch series contain the enhancements of EAL and fixes for libraries
> to run multi-pthreads(either EAL or non-EAL thread) per physical core.
> Two major changes list as below:
> - Extend the core affinity of each EAL thread to 1:n.
>   Each lcore stands for a EAL thread rather than a logical core.
>   The change adds new EAL option to allow static lcore to cpuset assginment.
>   Then a lcore(EAL thread) affinity to a cpuset, original 1:1 mapping is the special case.
> - Fix the libraries to allow running on any non-EAL thread.
>   It fix the gaps running libraries in non-EAL thread(dynamic created by user).
>   Each fix libraries take care the case of rte_lcore_id() >= RTE_MAX_LCORE.
> 
> Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in RFC review.
> 
> *** BLURB HERE ***
> 
> Cunming Liang (15):
>   eal: add cpuset into per EAL thread lcore_config
>   eal: new eal option '--lcores' for cpu assignment
>   eal: add support parsing socket_id from cpuset
>   eal: new TLS definition and API declaration
>   eal: add eal_common_thread.c for common thread API
>   eal: add rte_gettid() to acquire unique system tid
>   eal: apply affinity of EAL thread by assigned cpuset
>   enic: fix re-define freebsd compile complain
>   malloc: fix the issue of SOCKET_ID_ANY
>   log: fix the gap to support non-EAL thread
>   eal: set _lcore_id and _socket_id to (-1) by default
>   eal: fix recursive spinlock in non-EAL thraed
>   mempool: add support to non-EAL thread
>   ring: add support to non-EAL thread
>   timer: add support to non-EAL thread
> 
>  lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
>  lib/librte_eal/bsdapp/eal/eal.c                    |  13 +-
>  lib/librte_eal/bsdapp/eal/eal_lcore.c              |  14 ++
>  lib/librte_eal/bsdapp/eal/eal_memory.c             |   2 +
>  lib/librte_eal/bsdapp/eal/eal_thread.c             |  76 +++---
>  lib/librte_eal/common/eal_common_launch.c          |   1 -
>  lib/librte_eal/common/eal_common_log.c             |  17 +-
>  lib/librte_eal/common/eal_common_options.c         | 262 ++++++++++++++++++++-
>  lib/librte_eal/common/eal_common_thread.c          | 142 +++++++++++
>  lib/librte_eal/common/eal_options.h                |   2 +
>  lib/librte_eal/common/eal_thread.h                 |  66 ++++++
>  .../common/include/generic/rte_spinlock.h          |   4 +-
>  lib/librte_eal/common/include/rte_eal.h            |  27 +++
>  lib/librte_eal/common/include/rte_lcore.h          |  37 ++-
>  lib/librte_eal/common/include/rte_log.h            |   5 +
>  lib/librte_eal/linuxapp/eal/Makefile               |   4 +
>  lib/librte_eal/linuxapp/eal/eal.c                  |   7 +-
>  lib/librte_eal/linuxapp/eal/eal_lcore.c            |  15 ++
>  lib/librte_eal/linuxapp/eal/eal_thread.c           |  78 +++---
>  lib/librte_malloc/malloc_heap.h                    |   7 +-
>  lib/librte_mempool/rte_mempool.h                   |  18 +-
>  lib/librte_pmd_enic/enic.h                         |   1 +
>  lib/librte_pmd_enic/enic_compat.h                  |   1 +
>  lib/librte_ring/rte_ring.h                         |  10 +-
>  lib/librte_timer/rte_timer.c                       |  40 +++-
>  lib/librte_timer/rte_timer.h                       |   2 +-
>  26 files changed, 721 insertions(+), 131 deletions(-)
>  create mode 100644 lib/librte_eal/common/eal_common_thread.c
> 
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 1.8.1.4



* Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22 12:19     ` Bruce Richardson
@ 2015-01-22 14:34       ` Ananyev, Konstantin
  2015-01-22 15:17         ` Wodkowski, PawelX
  2015-01-22 15:23         ` Bruce Richardson
  2015-01-23  0:39       ` Liang, Cunming
  1 sibling, 2 replies; 253+ messages in thread
From: Ananyev, Konstantin @ 2015-01-22 14:34 UTC (permalink / raw)
  To: Richardson, Bruce, Liang, Cunming; +Cc: dev

Hi Bruce,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Thursday, January 22, 2015 12:19 PM
> To: Liang, Cunming
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
> 
> On Thu, Jan 22, 2015 at 04:16:25PM +0800, Cunming Liang wrote:
> > It supports one new eal long option '--lcores' for EAL thread cpuset assignment.
> >
> > The format pattern:
> > 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> > lcores, cpus could be a single digit or a group.
> > '(' and ')' are necessary if it's a group.
> > If not supply '@cpus', the value of cpus uses the same as lcores.
> >
> > e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
> >   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> >   lcore 1 runs on cpuset 0x2 (cpu 1)
> >   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> >   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> >   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> >
> 
> This strikes me as very confusing, though a couple of tweaks might help with
> readability. The lcore 0 at the end is especially confusing.

I didn't get you here: you find (0,6) confusing, right?
Because the parentheses implicitly specify the affinity for the whole group of enclosed lcores?

> Perhaps we can
> limit the allowed formats here,
> * require the lcore_id to be specified - the lack of an lcore id for the last part
> makes having it as lcore 0 surprising.

Again, I am not sure I understand you properly: lcore_id(s) are always specified explicitly.
The physical cpus part may be omitted.

> * only allow one lcore id to be given for each set of cores.

So you mean that for '(3-5)@(0,2)' the user would have to write '3@(0,2),4@(0,2),5@(0,2)'?
I don't see a big difference here, but imagine you'd like to create a pool of 32 EAL threads running on the same cpu set.
With the current syntax it is just something like '(32-63)@(0-7)'.
With what you are proposing it will be a very long list.
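
To put a number on "very long", here is a small sketch that expands a grouped entry into the one-lcore-per-entry form. expand_lcore_group() is a hypothetical helper written for this mail, not part of the patch or of the EAL option parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Expand the grouped form '(first-last)@cpus' into the per-lcore form
 * 'first@cpus,...,last@cpus'. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if the buffer is too small.
 */
static int
expand_lcore_group(unsigned first, unsigned last, const char *cpus,
		   char *out, size_t out_len)
{
	size_t used = 0;
	unsigned id;

	assert(cpus != NULL && out != NULL);
	assert(strlen(cpus) > 0 && first <= last);

	for (id = first; id <= last; id++) {
		int n = snprintf(out + used, out_len - used, "%s%u@%s",
				 id == first ? "" : ",", id, cpus);

		/* snprintf reports the length it wanted; detect truncation. */
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

For the 32-thread pool, expand_lcore_group(32, 63, "(0-7)", buf, sizeof(buf)) produces a 287-character string, against 13 characters for '(32-63)@(0-7)'.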

> 
> I think it may still be readable if we allow the core set to be omitted if its
> to be the same as the lcore_id.

I think that is supported.
See lcore_id=1 in Steve's example above.
As I understand it, --lcores='0,2,3-5' is equivalent to '-l 0,2,3-5' and to '-c 0x3d'.
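
The mask arithmetic behind that equivalence, and behind the cpuset values in Steve's example, can be checked with a few lines of C. ids_to_mask() is a hypothetical helper for illustration only; it is limited to ids below 64 and says nothing about how the EAL parser itself accepts or rejects a given string.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Build a bitmask with one bit set per listed lcore or cpu id. */
static uint64_t
ids_to_mask(const unsigned *ids, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	assert(ids != NULL || n == 0);
	for (i = 0; i < n; i++) {
		assert(ids[i] < 64); /* sketch limit: ids must fit in 64 bits */
		mask |= UINT64_C(1) << ids[i];
	}
	return mask;
}
```

With ids {0, 2, 3, 4, 5} the result is 0x3d; with cpus {0, 6}, {5, 6, 7} and {0, 2} it gives 0x41, 0xe0 and 0x5, matching the cpuset values listed in the commit message.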

Konstantin

> 
> It's probably still not going to be very tidy, but I think we can improve things.
> 
> /Bruce
> 
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/common/eal_common_launch.c  |   1 -
> >  lib/librte_eal/common/eal_common_options.c | 262 ++++++++++++++++++++++++++++-
> >  lib/librte_eal/common/eal_options.h        |   2 +
> >  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
> >  4 files changed, 261 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
> > index 599f83b..2d732b1 100644
> > --- a/lib/librte_eal/common/eal_common_launch.c
> > +++ b/lib/librte_eal/common/eal_common_launch.c
> > @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
> >  		rte_eal_wait_lcore(lcore_id);
> >  	}
> >  }
> > -
> > diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> > index e2810ab..fc47588 100644
> > --- a/lib/librte_eal/common/eal_common_options.c
> > +++ b/lib/librte_eal/common/eal_common_options.c
> > @@ -45,6 +45,7 @@
> >  #include <rte_lcore.h>
> >  #include <rte_version.h>
> >  #include <rte_devargs.h>
> > +#include <rte_memcpy.h>
> >
> >  #include "eal_internal_cfg.h"
> >  #include "eal_options.h"
> > @@ -85,6 +86,7 @@ eal_long_options[] = {
> >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
> >  	{0, 0, 0, 0}
> >  };
> >
> > @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
> >  			if (min == RTE_MAX_LCORE)
> >  				min = idx;
> >  			for (idx = min; idx <= max; idx++) {
> > -				cfg->lcore_role[idx] = ROLE_RTE;
> > -				lcore_config[idx].core_index = count;
> > -				count++;
> > +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> > +					cfg->lcore_role[idx] = ROLE_RTE;
> > +					lcore_config[idx].core_index = count;
> > +					count++;
> > +				}
> >  			}
> >  			min = RTE_MAX_LCORE;
> >  		} else
> > @@ -289,6 +293,241 @@ eal_parse_master_lcore(const char *arg)
> >  	return 0;
> >  }
> >
> > +/*
> > + * Parse elem, the elem could be single number or '(' ')' group
> > + * Within group elem, '-' used for a range seperator;
> > + *                    ',' used for a single number.
> > + */
> > +static int
> > +eal_parse_set(const char *input, uint16_t set[], unsigned num)
> > +{
> > +	unsigned idx;
> > +	const char *str = input;
> > +	char *end = NULL;
> > +	unsigned min, max;
> > +
> > +	memset(set, 0, num * sizeof(uint16_t));
> > +
> > +	while (isblank(*str))
> > +		str++;
> > +
> > +	/* only digit or left bracket is qulify for start point */
> > +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> > +		return -1;
> > +
> > +	/* process single number */
> > +	if (*str != '(') {
> > +		errno = 0;
> > +		idx = strtoul(str, &end, 10);
> > +		if (errno || end == NULL || idx >= num)
> > +			return -1;
> > +		else {
> > +			while (isblank(*end))
> > +				end++;
> > +
> > +			if (*end != ',' && *end != '\0' &&
> > +			    *end != '@')
> > +				return -1;
> > +
> > +			set[idx] = 1;
> > +			return end - input;
> > +		}
> > +	}
> > +
> > +	/* process set within bracket */
> > +	str++;
> > +	while (isblank(*str))
> > +		str++;
> > +	if (*str == '\0')
> > +		return -1;
> > +
> > +	min = RTE_MAX_LCORE;
> > +	do {
> > +
> > +		/* go ahead to the first digit */
> > +		while (isblank(*str))
> > +			str++;
> > +		if (!isdigit(*str))
> > +			return -1;
> > +
> > +		/* get the digit value */
> > +		errno = 0;
> > +		idx = strtoul(str, &end, 10);
> > +		if (errno || end == NULL || idx >= num)
> > +			return -1;
> > +
> > +		/* go ahead to separator '-',',' and ')' */
> > +		while (isblank(*end))
> > +			end++;
> > +		if (*end == '-') {
> > +			if (min == RTE_MAX_LCORE)
> > +				min = idx;
> > +			else /* avoid continuous '-' */
> > +				return -1;
> > +		} else if ((*end == ',') || (*end == ')')) {
> > +			max = idx;
> > +			if (min == RTE_MAX_LCORE)
> > +				min = idx;
> > +			for (idx = RTE_MIN(min, max);
> > +			     idx <= RTE_MAX(min, max); idx++)
> > +				set[idx] = 1;
> > +
> > +			min = RTE_MAX_LCORE;
> > +		} else
> > +			return -1;
> > +
> > +		str = end + 1;
> > +	} while (*end != '\0' && *end != ')');
> > +
> > +	return str - input;
> > +}
> > +
> > +/* convert from set array to cpuset bitmap */
> > +static inline int
> > +convert_to_cpuset(rte_cpuset_t *cpusetp,
> > +	      uint16_t *set, unsigned num)
> > +{
> > +	unsigned idx;
> > +
> > +	CPU_ZERO(cpusetp);
> > +
> > +	for (idx = 0; idx < num; idx++) {
> > +		if (!set[idx])
> > +			continue;
> > +
> > +		if (!lcore_config[idx].detected) {
> > +			RTE_LOG(ERR, EAL, "core %u "
> > +				"unavailable\n", idx);
> > +			return -1;
> > +		}
> > +
> > +		CPU_SET(idx, cpusetp);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
> > + * lcores, cpus could be a single digit or a group.
> > + * '(' and ')' are necessary if it's a group.
> > + * If not supply '@cpus', the value of cpus uses the same as lcores.
> > + * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means start 7 EAL thread as below
> > + *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > + *   lcore 1 runs on cpuset 0x2 (cpu 1)
> > + *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > + *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > + *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > + */
> > +static int
> > +eal_parse_lcores(const char *lcores)
> > +{
> > +	struct rte_config *cfg = rte_eal_get_configuration();
> > +	static uint16_t set[RTE_MAX_LCORE];
> > +	unsigned idx = 0;
> > +	int i;
> > +	unsigned count = 0;
> > +	const char *lcore_start = NULL;
> > +	const char *end = NULL;
> > +	int offset;
> > +	rte_cpuset_t cpuset;
> > +	int ret = -1;
> > +
> > +	if (lcores == NULL)
> > +		return -1;
> > +
> > +	/* Remove all blank characters ahead and after */
> > +	while (isblank(*lcores))
> > +		lcores++;
> > +	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> > +	while ((i > 0) && isblank(lcores[i - 1]))
> > +		i--;
> > +
> > +	CPU_ZERO(&cpuset);
> > +
> > +	/* Reset lcore config */
> > +	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > +		cfg->lcore_role[idx] = ROLE_OFF;
> > +		lcore_config[idx].core_index = -1;
> > +		CPU_ZERO(&lcore_config[idx].cpuset);
> > +	}
> > +
> > +	/* Get list of cores */
> > +	do {
> > +		while (isblank(*lcores))
> > +			lcores++;
> > +		if (*lcores == '\0')
> > +			goto err;
> > +
> > +		/* record lcore_set start point */
> > +		lcore_start = lcores;
> > +
> > +		/* go across a complete bracket */
> > +		if (*lcore_start == '(') {
> > +			lcores += strcspn(lcores, ")");
> > +			if (*lcores++ == '\0')
> > +				goto err;
> > +		}
> > +
> > +		/* scan the separator '@', ','(next) or '\0'(finish) */
> > +		lcores += strcspn(lcores, "@,");
> > +
> > +		if (*lcores == '@') {
> > +			/* explict assign cpu_set */
> > +			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
> > +			if (offset < 0)
> > +				goto err;
> > +
> > +			/* prepare cpu_set and update the end cursor */
> > +			if (0 > convert_to_cpuset(&cpuset,
> > +						  set, RTE_DIM(set)))
> > +				goto err;
> > +			end = lcores + 1 + offset;
> > +		} else  /* ',' or '\0' */
> > +			/* haven't given cpu_set, current loop done */
> > +			end = lcores;
> > +
> > +		if (*end != ',' && *end != '\0')
> > +			goto err;
> > +
> > +		/* parse lcore_set from start point */
> > +		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
> > +			goto err;
> > +
> > +		/* without '@', by default using lcore_set as cpu_set */
> > +		if (*lcores != '@' &&
> > +		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
> > +			goto err;
> > +
> > +		/* start to update lcore_set */
> > +		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > +			if (!set[idx])
> > +				continue;
> > +
> > +			if (cfg->lcore_role[idx] != ROLE_RTE) {
> > +				lcore_config[idx].core_index = count;
> > +				cfg->lcore_role[idx] = ROLE_RTE;
> > +				count++;
> > +			}
> > +			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
> > +				   sizeof(rte_cpuset_t));
> > +		}
> > +
> > +		lcores = end + 1;
> > +	} while (*end != '\0');
> > +
> > +	if (count == 0)
> > +		goto err;
> > +
> > +	cfg->lcore_count = count;
> > +	lcores_parsed = 1;
> > +	ret = 0;
> > +
> > +err:
> > +
> > +	return ret;
> > +}
> > +
> >  static int
> >  eal_parse_syslog(const char *facility, struct internal_config *conf)
> >  {
> > @@ -489,6 +728,13 @@ eal_parse_common_option(int opt, const char *optarg,
> >  		conf->log_level = log;
> >  		break;
> >  	}
> > +	case OPT_LCORES_NUM:
> > +		if (eal_parse_lcores(optarg) < 0) {
> > +			RTE_LOG(ERR, EAL, "invalid parameter for --"
> > +				OPT_LCORES "\n");
> > +			return -1;
> > +		}
> > +		break;
> >
> >  	/* don't know what to do, leave this to caller */
> >  	default:
> > @@ -527,7 +773,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
> >
> >  	if (!lcores_parsed) {
> >  		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > -			"-c or -l\n");
> > +			"-c, -l or --lcores\n");
> >  		return -1;
> >  	}
> >  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > @@ -583,6 +829,14 @@ eal_common_usage(void)
> >  	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
> >  	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
> >  	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
> > +	       "  --"OPT_LCORES" MAP: maps between lcore_set to phys_cpu_set\n"
> > +	       "                 The argument format is\n"
> > +	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
> > +	       "                 lcores and cpus list are grouped by '(' and ')'\n"
> > +	       "                 Within the group, '-' is used for range separator,\n"
> > +	       "                 ',' is used for single number separator.\n"
> > +	       "                 '( )' can be omitted for single element group, '@' \n"
> > +	       "                 can be omitted if cpus and lcores has the same value\n"
> >  	       "  -n NUM       : Number of memory channels\n"
> >  	       "  -v           : Display version information on startup\n"
> >  	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
> > diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> > index e476f8d..a1cc59f 100644
> > --- a/lib/librte_eal/common/eal_options.h
> > +++ b/lib/librte_eal/common/eal_options.h
> > @@ -77,6 +77,8 @@ enum {
> >  	OPT_CREATE_UIO_DEV_NUM,
> >  #define OPT_VFIO_INTR    "vfio-intr"
> >  	OPT_VFIO_INTR_NUM,
> > +#define OPT_LCORES "lcores"
> > +	OPT_LCORES_NUM,
> >  	OPT_LONG_MAX_NUM
> >  };
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
> > index 0e9c447..025d836 100644
> > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > @@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
> >  CFLAGS_eal_pci.o := -D_GNU_SOURCE
> >  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
> >  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
> > +CFLAGS_eal_common_options.o := -D_GNU_SOURCE
> >
> >  # workaround for a gcc bug with noreturn attribute
> >  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> > --
> > 1.8.1.4
> >


* Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22 14:34       ` Ananyev, Konstantin
@ 2015-01-22 15:17         ` Wodkowski, PawelX
  2015-01-25 15:34           ` Liang, Cunming
  2015-01-22 15:23         ` Bruce Richardson
  1 sibling, 1 reply; 253+ messages in thread
From: Wodkowski, PawelX @ 2015-01-22 15:17 UTC (permalink / raw)
  To: Ananyev, Konstantin, Richardson, Bruce, Liang, Cunming; +Cc: dev

Hi,
I want to mention that Pktgen-DPDK has a similar syntax for defining the
core-to-port mapping, which I find much more readable. Maybe we can adopt that
syntax for the new '--lcores' parameter.

See the '-m' parameter syntax in the Pktgen README:
https://github.com/pktgen/Pktgen-DPDK/blob/master/dpdk/examples/pktgen/README.md

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Thursday, January 22, 2015 3:34 PM
> To: Richardson, Bruce; Liang, Cunming
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu
> assignment
> 
> Hi Bruce,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Thursday, January 22, 2015 12:19 PM
> > To: Liang, Cunming
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu
> assignment
> >
> > On Thu, Jan 22, 2015 at 04:16:25PM +0800, Cunming Liang wrote:
> > > It supports one new eal long option '--lcores' for EAL thread cpuset
> assignment.
> > >
> > > The format pattern:
> > > 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> > > lcores, cpus could be a single digit or a group.
> > > '(' and ')' are necessary if it's a group.
> > > If not supply '@cpus', the value of cpus uses the same as lcores.
> > >
> > > e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
> > >   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > >   lcore 1 runs on cpuset 0x2 (cpu 1)
> > >   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > >   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > >   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > >
> >
> > This strikes me as very confusing, though a couple of tweaks might help with
> > readability. The lcore 0 at the end is especially confusing.
> 
> Didn't get you here: do you find (0,6) confusing, right?
> Because braces implicitly specifies affinity for group of en-braced lcores?
> 
> > Perhaps we can
> > limit the allowed formats here,
> > * require the lcore_id to be specified - the lack of an lcore id for the last part
> > makes having it as lcore 0 surprising.
> 
> Again, not sure I understand you properly:  lcore_id(s) are always specified
> explicitly.
> Physical cpus part might be omitted.
> 
> > * only allow one lcore id to be given for each set of cores.
> 
> So you mean for '(3-5)@(0,2)' user would have to: '3@(0,2),4@(0,2),5@(0,2)'?
> I don't see big difference here, but imagine you'd like to create a pool of 32 EAL-
> threads running on same cpu set.
> With current syntax it is just something like: '(32-63)@(0-7)'.
> With what you proposing it will be a very long list.
> 
> >
> > I think it may still be readable if we allow the core set to be omitted if its
> > to be the same as the lcore_id.
> 
> I think that is supported.
> See lcore_id=1 in Steve's example above.
> As I understand: --lcores='0,2,3-5' is equal to '-l 0,2,3-5' and to '-c 0x3d'.
> 
> Konstantin
> 
> >
> > It's probably still not going to be very tidy, but I think we can improve things.
> >
> > /Bruce
> >
> > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > ---
> > >  lib/librte_eal/common/eal_common_launch.c  |   1 -
> > >  lib/librte_eal/common/eal_common_options.c | 262
> ++++++++++++++++++++++++++++-
> > >  lib/librte_eal/common/eal_options.h        |   2 +
> > >  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
> > >  4 files changed, 261 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_launch.c
> b/lib/librte_eal/common/eal_common_launch.c
> > > index 599f83b..2d732b1 100644
> > > --- a/lib/librte_eal/common/eal_common_launch.c
> > > +++ b/lib/librte_eal/common/eal_common_launch.c
> > > @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
> > >  		rte_eal_wait_lcore(lcore_id);
> > >  	}
> > >  }
> > > -
> > > diff --git a/lib/librte_eal/common/eal_common_options.c
> b/lib/librte_eal/common/eal_common_options.c
> > > index e2810ab..fc47588 100644
> > > --- a/lib/librte_eal/common/eal_common_options.c
> > > +++ b/lib/librte_eal/common/eal_common_options.c
> > > @@ -45,6 +45,7 @@
> > >  #include <rte_lcore.h>
> > >  #include <rte_version.h>
> > >  #include <rte_devargs.h>
> > > +#include <rte_memcpy.h>
> > >
> > >  #include "eal_internal_cfg.h"
> > >  #include "eal_options.h"
> > > @@ -85,6 +86,7 @@ eal_long_options[] = {
> > >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> > >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> > >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > > +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
> > >  	{0, 0, 0, 0}
> > >  };
> > >
> > > @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
> > >  			if (min == RTE_MAX_LCORE)
> > >  				min = idx;
> > >  			for (idx = min; idx <= max; idx++) {
> > > -				cfg->lcore_role[idx] = ROLE_RTE;
> > > -				lcore_config[idx].core_index = count;
> > > -				count++;
> > > +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> > > +					cfg->lcore_role[idx] = ROLE_RTE;
> > > +					lcore_config[idx].core_index = count;
> > > +					count++;
> > > +				}
> > >  			}
> > >  			min = RTE_MAX_LCORE;
> > >  		} else
> > > @@ -289,6 +293,241 @@ eal_parse_master_lcore(const char *arg)
> > >  	return 0;
> > >  }
> > >
> > > +/*
> > > + * Parse elem, the elem could be single number or '(' ')' group
> > > + * Within group elem, '-' used for a range seperator;
> > > + *                    ',' used for a single number.
> > > + */
> > > +static int
> > > +eal_parse_set(const char *input, uint16_t set[], unsigned num)
> > > +{
> > > +	unsigned idx;
> > > +	const char *str = input;
> > > +	char *end = NULL;
> > > +	unsigned min, max;
> > > +
> > > +	memset(set, 0, num * sizeof(uint16_t));
> > > +
> > > +	while (isblank(*str))
> > > +		str++;
> > > +
> > > +	/* only digit or left bracket is qulify for start point */
> > > +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> > > +		return -1;
> > > +
> > > +	/* process single number */
> > > +	if (*str != '(') {
> > > +		errno = 0;
> > > +		idx = strtoul(str, &end, 10);
> > > +		if (errno || end == NULL || idx >= num)
> > > +			return -1;
> > > +		else {
> > > +			while (isblank(*end))
> > > +				end++;
> > > +
> > > +			if (*end != ',' && *end != '\0' &&
> > > +			    *end != '@')
> > > +				return -1;
> > > +
> > > +			set[idx] = 1;
> > > +			return end - input;
> > > +		}
> > > +	}
> > > +
> > > +	/* process set within bracket */
> > > +	str++;
> > > +	while (isblank(*str))
> > > +		str++;
> > > +	if (*str == '\0')
> > > +		return -1;
> > > +
> > > +	min = RTE_MAX_LCORE;
> > > +	do {
> > > +
> > > +		/* go ahead to the first digit */
> > > +		while (isblank(*str))
> > > +			str++;
> > > +		if (!isdigit(*str))
> > > +			return -1;
> > > +
> > > +		/* get the digit value */
> > > +		errno = 0;
> > > +		idx = strtoul(str, &end, 10);
> > > +		if (errno || end == NULL || idx >= num)
> > > +			return -1;
> > > +
> > > +		/* go ahead to separator '-',',' and ')' */
> > > +		while (isblank(*end))
> > > +			end++;
> > > +		if (*end == '-') {
> > > +			if (min == RTE_MAX_LCORE)
> > > +				min = idx;
> > > +			else /* avoid continuous '-' */
> > > +				return -1;
> > > +		} else if ((*end == ',') || (*end == ')')) {
> > > +			max = idx;
> > > +			if (min == RTE_MAX_LCORE)
> > > +				min = idx;
> > > +			for (idx = RTE_MIN(min, max);
> > > +			     idx <= RTE_MAX(min, max); idx++)
> > > +				set[idx] = 1;
> > > +
> > > +			min = RTE_MAX_LCORE;
> > > +		} else
> > > +			return -1;
> > > +
> > > +		str = end + 1;
> > > +	} while (*end != '\0' && *end != ')');
> > > +
> > > +	return str - input;
> > > +}
> > > +
> > > +/* convert from set array to cpuset bitmap */
> > > +static inline int
> > > +convert_to_cpuset(rte_cpuset_t *cpusetp,
> > > +	      uint16_t *set, unsigned num)
> > > +{
> > > +	unsigned idx;
> > > +
> > > +	CPU_ZERO(cpusetp);
> > > +
> > > +	for (idx = 0; idx < num; idx++) {
> > > +		if (!set[idx])
> > > +			continue;
> > > +
> > > +		if (!lcore_config[idx].detected) {
> > > +			RTE_LOG(ERR, EAL, "core %u "
> > > +				"unavailable\n", idx);
> > > +			return -1;
> > > +		}
> > > +
> > > +		CPU_SET(idx, cpusetp);
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/*
> > > + * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
> > > + * lcores, cpus could be a single digit or a group.
> > > + * '(' and ')' are necessary if it's a group.
> > > + * If not supply '@cpus', the value of cpus uses the same as lcores.
> > > + * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means start 7 EAL thread as below
> > > + *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > > + *   lcore 1 runs on cpuset 0x2 (cpu 1)
> > > + *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > > + *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > > + *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > > + */
> > > +static int
> > > +eal_parse_lcores(const char *lcores)
> > > +{
> > > +	struct rte_config *cfg = rte_eal_get_configuration();
> > > +	static uint16_t set[RTE_MAX_LCORE];
> > > +	unsigned idx = 0;
> > > +	int i;
> > > +	unsigned count = 0;
> > > +	const char *lcore_start = NULL;
> > > +	const char *end = NULL;
> > > +	int offset;
> > > +	rte_cpuset_t cpuset;
> > > +	int ret = -1;
> > > +
> > > +	if (lcores == NULL)
> > > +		return -1;
> > > +
> > > +	/* Remove all blank characters ahead and after */
> > > +	while (isblank(*lcores))
> > > +		lcores++;
> > > +	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> > > +	while ((i > 0) && isblank(lcores[i - 1]))
> > > +		i--;
> > > +
> > > +	CPU_ZERO(&cpuset);
> > > +
> > > +	/* Reset lcore config */
> > > +	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > > +		cfg->lcore_role[idx] = ROLE_OFF;
> > > +		lcore_config[idx].core_index = -1;
> > > +		CPU_ZERO(&lcore_config[idx].cpuset);
> > > +	}
> > > +
> > > +	/* Get list of cores */
> > > +	do {
> > > +		while (isblank(*lcores))
> > > +			lcores++;
> > > +		if (*lcores == '\0')
> > > +			goto err;
> > > +
> > > +		/* record lcore_set start point */
> > > +		lcore_start = lcores;
> > > +
> > > +		/* go across a complete bracket */
> > > +		if (*lcore_start == '(') {
> > > +			lcores += strcspn(lcores, ")");
> > > +			if (*lcores++ == '\0')
> > > +				goto err;
> > > +		}
> > > +
> > > +		/* scan the separator '@', ','(next) or '\0'(finish) */
> > > +		lcores += strcspn(lcores, "@,");
> > > +
> > > +		if (*lcores == '@') {
> > > +			/* explict assign cpu_set */
> > > +			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
> > > +			if (offset < 0)
> > > +				goto err;
> > > +
> > > +			/* prepare cpu_set and update the end cursor */
> > > +			if (0 > convert_to_cpuset(&cpuset,
> > > +						  set, RTE_DIM(set)))
> > > +				goto err;
> > > +			end = lcores + 1 + offset;
> > > +		} else  /* ',' or '\0' */
> > > +			/* haven't given cpu_set, current loop done */
> > > +			end = lcores;
> > > +
> > > +		if (*end != ',' && *end != '\0')
> > > +			goto err;
> > > +
> > > +		/* parse lcore_set from start point */
> > > +		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
> > > +			goto err;
> > > +
> > > +		/* without '@', by default using lcore_set as cpu_set */
> > > +		if (*lcores != '@' &&
> > > +		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
> > > +			goto err;
> > > +
> > > +		/* start to update lcore_set */
> > > +		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > > +			if (!set[idx])
> > > +				continue;
> > > +
> > > +			if (cfg->lcore_role[idx] != ROLE_RTE) {
> > > +				lcore_config[idx].core_index = count;
> > > +				cfg->lcore_role[idx] = ROLE_RTE;
> > > +				count++;
> > > +			}
> > > +			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
> > > +				   sizeof(rte_cpuset_t));
> > > +		}
> > > +
> > > +		lcores = end + 1;
> > > +	} while (*end != '\0');
> > > +
> > > +	if (count == 0)
> > > +		goto err;
> > > +
> > > +	cfg->lcore_count = count;
> > > +	lcores_parsed = 1;
> > > +	ret = 0;
> > > +
> > > +err:
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  static int
> > >  eal_parse_syslog(const char *facility, struct internal_config *conf)
> > >  {
> > > @@ -489,6 +728,13 @@ eal_parse_common_option(int opt, const char
> *optarg,
> > >  		conf->log_level = log;
> > >  		break;
> > >  	}
> > > +	case OPT_LCORES_NUM:
> > > +		if (eal_parse_lcores(optarg) < 0) {
> > > +			RTE_LOG(ERR, EAL, "invalid parameter for --"
> > > +				OPT_LCORES "\n");
> > > +			return -1;
> > > +		}
> > > +		break;
> > >
> > >  	/* don't know what to do, leave this to caller */
> > >  	default:
> > > @@ -527,7 +773,7 @@ eal_check_common_options(struct internal_config
> *internal_cfg)
> > >
> > >  	if (!lcores_parsed) {
> > >  		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > > -			"-c or -l\n");
> > > +			"-c, -l or --lcores\n");
> > >  		return -1;
> > >  	}
> > >  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > > @@ -583,6 +829,14 @@ eal_common_usage(void)
> > >  	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
> > >  	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
> > >  	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
> > > +	       "  --"OPT_LCORES" MAP: maps between lcore_set to
> phys_cpu_set\n"
> > > +	       "                 The argument format is\n"
> > > +	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
> > > +	       "                 lcores and cpus list are grouped by '(' and ')'\n"
> > > +	       "                 Within the group, '-' is used for range separator,\n"
> > > +	       "                 ',' is used for single number separator.\n"
> > > +	       "                 '( )' can be omitted for single element group, '@' \n"
> > > +	       "                 can be omitted if cpus and lcores has the same value\n"
> > >  	       "  -n NUM       : Number of memory channels\n"
> > >  	       "  -v           : Display version information on startup\n"
> > >  	       "  -m MB        : memory to allocate (see also --
> "OPT_SOCKET_MEM")\n"
> > > diff --git a/lib/librte_eal/common/eal_options.h
> b/lib/librte_eal/common/eal_options.h
> > > index e476f8d..a1cc59f 100644
> > > --- a/lib/librte_eal/common/eal_options.h
> > > +++ b/lib/librte_eal/common/eal_options.h
> > > @@ -77,6 +77,8 @@ enum {
> > >  	OPT_CREATE_UIO_DEV_NUM,
> > >  #define OPT_VFIO_INTR    "vfio-intr"
> > >  	OPT_VFIO_INTR_NUM,
> > > +#define OPT_LCORES "lcores"
> > > +	OPT_LCORES_NUM,
> > >  	OPT_LONG_MAX_NUM
> > >  };
> > >
> > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile
> b/lib/librte_eal/linuxapp/eal/Makefile
> > > index 0e9c447..025d836 100644
> > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > @@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
> > >  CFLAGS_eal_pci.o := -D_GNU_SOURCE
> > >  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
> > >  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
> > > +CFLAGS_eal_common_options.o := -D_GNU_SOURCE
> > >
> > >  # workaround for a gcc bug with noreturn attribute
> > >  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> > > --
> > > 1.8.1.4
> > >


* Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22 14:34       ` Ananyev, Konstantin
  2015-01-22 15:17         ` Wodkowski, PawelX
@ 2015-01-22 15:23         ` Bruce Richardson
  1 sibling, 0 replies; 253+ messages in thread
From: Bruce Richardson @ 2015-01-22 15:23 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

On Thu, Jan 22, 2015 at 02:34:07PM +0000, Ananyev, Konstantin wrote:
> Hi Bruce,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Thursday, January 22, 2015 12:19 PM
> > To: Liang, Cunming
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
> > 
> > On Thu, Jan 22, 2015 at 04:16:25PM +0800, Cunming Liang wrote:
> > > It supports one new eal long option '--lcores' for EAL thread cpuset assignment.
> > >
> > > The format pattern:
> > > 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> > > lcores, cpus could be a single digit or a group.
> > > '(' and ')' are necessary if it's a group.
> > > If not supply '@cpus', the value of cpus uses the same as lcores.
> > >
> > > e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
> > >   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > >   lcore 1 runs on cpuset 0x2 (cpu 1)
> > >   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > >   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > >   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > >
> > 
> > This strikes me as very confusing, though a couple of tweaks might help with
> > readability. The lcore 0 at the end is especially confusing.
> 
> Didn't get you here: do you find (0,6) confusing, right?
> Because braces implicitly specifies affinity for group of en-braced lcores? 
> 
> > Perhaps we can
> > limit the allowed formats here,
> > * require the lcore_id to be specified - the lack of an lcore id for the last part
> > makes having it as lcore 0 surprising.
> 
> Again, not sure I understand you properly:  lcore_id(s) are always specified explicitly. 
> Physical cpus part might be omitted.
> 
> > * only allow one lcore id to be given for each set of cores.
> 
> So you mean for '(3-5)@(0,2)' user would have to: '3@(0,2),4@(0,2),5@(0,2)'?
> I don't see big difference here, but imagine you'd like to create a pool of 32 EAL-threads running on same cpu set.
> With current syntax it is just something like: '(32-63)@(0-7)'.
> With what you proposing it will be a very long list.  
> 
> > 
> > I think it may still be readable if we allow the core set to be omitted if its
> > to be the same as the lcore_id.
> 
> I think that is supported.
> See lcore_id=1 in Steve's example above.
> As I understand: --lcores='0,2,3-5' is equal to '-l 0,2,3-5' and to '-c 0x3d'.
> 
> Konstantin

Ok, thanks for the clarification.

/Bruce
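
For reference, the arithmetic behind the quoted equivalence between the core
list '0,2,3-5' and the coremask 0x3d can be checked with a small standalone
sketch. It is not code from the patch: corelist_to_mask() is a hypothetical
helper, it handles only the plain '-l' style list (no '(' ')' groups and no
'@cpus' part), and it assumes core ids below 64 so that the mask fits in one
uint64_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch): convert a plain core list
 * such as "0,2,3-5" into a 64-bit coremask.
 * Returns 0 on malformed input or on a core id above 63.
 */
static uint64_t
corelist_to_mask(const char *list)
{
	uint64_t mask = 0;
	char *end;

	assert(list != NULL);

	while (*list != '\0') {
		unsigned long lo, hi, i;

		/* first number of the element */
		lo = strtoul(list, &end, 10);
		if (end == list)
			return 0;
		hi = lo;

		/* optional "-<number>" turns the element into a range */
		if (*end == '-') {
			const char *p = end + 1;

			hi = strtoul(p, &end, 10);
			if (end == p)
				return 0;
		}
		if (lo > hi || hi > 63)
			return 0;

		for (i = lo; i <= hi; i++)
			mask |= UINT64_C(1) << i;

		/* an element is followed by ',' plus another element, or by the end */
		if (*end == ',') {
			end++;
			if (*end == '\0')
				return 0;
		} else if (*end != '\0')
			return 0;

		list = end;
	}

	return mask;
}
```

Cores 0, 2, 3, 4 and 5 set bits 0x01, 0x04, 0x08, 0x10 and 0x20, which sum to
0x3d, so corelist_to_mask("0,2,3-5") returns the same value that '-c 0x3d'
expresses directly.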


* Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22 12:19     ` Bruce Richardson
  2015-01-22 14:34       ` Ananyev, Konstantin
@ 2015-01-23  0:39       ` Liang, Cunming
  1 sibling, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-01-23  0:39 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: dev



> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, January 22, 2015 8:19 PM
> To: Liang, Cunming
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu
> assignment
> 
> On Thu, Jan 22, 2015 at 04:16:25PM +0800, Cunming Liang wrote:
> > It supports one new eal long option '--lcores' for EAL thread cpuset assignment.
> >
> > The format pattern:
> > 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> > lcores, cpus could be a single digit or a group.
> > '(' and ')' are necessary if it's a group.
> > If not supply '@cpus', the value of cpus uses the same as lcores.
> >
> > e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
> >   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> >   lcore 1 runs on cpuset 0x2 (cpu 1)
> >   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> >   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> >   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> >
> 
> This strikes me as very confusing, though a couple of tweaks might help with
> readability. The lcore 0 at the end is especially confusing. Perhaps we can
> limit the allowed formats here,
> * require the lcore_id to be specified - the lack of an lcore id for the last part
> makes having it as lcore 0 surprising.
> * only allow one lcore id to be given for each set of cores.
[Liang, Cunming] The last lcore_set, (0,6), has no cpuset assigned, so it is equivalent to '(0,6)@(0,6)' or '0@(0,6),6@(0,6)'.
It is not a typical use case; it is a deliberately aggressive sample, meant as a compact way to explain the mapping.
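
The mapping rule described here can be sketched as a small standalone example.
This is only an illustration under stated assumptions, not the patch code:
map_lcores(), build_example_map() and MAX_LCORE are hypothetical names, sets
are modelled as 64-bit masks instead of rte_cpuset_t, and a cpu_mask of 0
stands for an omitted '@cpus' part.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LCORE 64 /* illustrative limit, not RTE_MAX_LCORE */

/*
 * Hypothetical helper: give every lcore in lcore_mask the cpuset cpu_mask.
 * cpu_mask == 0 models an omitted '@cpus', so the lcore set itself is used.
 * The caller must zero cpuset[] first.
 * Returns the number of lcores enabled for the first time.
 */
static unsigned
map_lcores(uint64_t cpuset[MAX_LCORE], uint64_t lcore_mask, uint64_t cpu_mask)
{
	unsigned idx, added = 0;

	assert(lcore_mask != 0);

	if (cpu_mask == 0)
		cpu_mask = lcore_mask;

	for (idx = 0; idx < MAX_LCORE; idx++) {
		if (!(lcore_mask & (UINT64_C(1) << idx)))
			continue;
		if (cpuset[idx] == 0)
			added++;
		cpuset[idx] = cpu_mask;
	}

	return added;
}

/* Build the map for '1,2@(5-7),(3-5)@(0,2),(0,6)', one call per element */
static unsigned
build_example_map(uint64_t cpuset[MAX_LCORE])
{
	unsigned count = 0;

	count += map_lcores(cpuset, 0x02, 0);    /* 1           -> cpu 1     */
	count += map_lcores(cpuset, 0x04, 0xe0); /* 2@(5-7)     -> cpu 5,6,7 */
	count += map_lcores(cpuset, 0x38, 0x05); /* (3-5)@(0,2) -> cpu 0,2   */
	count += map_lcores(cpuset, 0x41, 0);    /* (0,6)       -> cpu 0,6   */

	return count;
}
```

With a zeroed array, build_example_map() enables 7 lcores, and lcores 0 and 6
both end up with cpuset 0x41, which matches the table in the commit message.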
> 
> I think it may still be readable if we allow the core set to be omitted if its
> to be the same as the lcore_id.
> 
> It's probably still not going to be very tidy, but I think we can improve things.
> 
> /Bruce
> 
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/common/eal_common_launch.c  |   1 -
> >  lib/librte_eal/common/eal_common_options.c | 262
> ++++++++++++++++++++++++++++-
> >  lib/librte_eal/common/eal_options.h        |   2 +
> >  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
> >  4 files changed, 261 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_launch.c
> b/lib/librte_eal/common/eal_common_launch.c
> > index 599f83b..2d732b1 100644
> > --- a/lib/librte_eal/common/eal_common_launch.c
> > +++ b/lib/librte_eal/common/eal_common_launch.c
> > @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
> >  		rte_eal_wait_lcore(lcore_id);
> >  	}
> >  }
> > -
> > diff --git a/lib/librte_eal/common/eal_common_options.c
> b/lib/librte_eal/common/eal_common_options.c
> > index e2810ab..fc47588 100644
> > --- a/lib/librte_eal/common/eal_common_options.c
> > +++ b/lib/librte_eal/common/eal_common_options.c
> > @@ -45,6 +45,7 @@
> >  #include <rte_lcore.h>
> >  #include <rte_version.h>
> >  #include <rte_devargs.h>
> > +#include <rte_memcpy.h>
> >
> >  #include "eal_internal_cfg.h"
> >  #include "eal_options.h"
> > @@ -85,6 +86,7 @@ eal_long_options[] = {
> >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
> >  	{0, 0, 0, 0}
> >  };
> >
> > @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
> >  			if (min == RTE_MAX_LCORE)
> >  				min = idx;
> >  			for (idx = min; idx <= max; idx++) {
> > -				cfg->lcore_role[idx] = ROLE_RTE;
> > -				lcore_config[idx].core_index = count;
> > -				count++;
> > +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> > +					cfg->lcore_role[idx] = ROLE_RTE;
> > +					lcore_config[idx].core_index = count;
> > +					count++;
> > +				}
> >  			}
> >  			min = RTE_MAX_LCORE;
> >  		} else
> > @@ -289,6 +293,241 @@ eal_parse_master_lcore(const char *arg)
> >  	return 0;
> >  }
> >
> > +/*
> > + * Parse elem, the elem could be single number or '(' ')' group
> > + * Within group elem, '-' used for a range seperator;
> > + *                    ',' used for a single number.
> > + */
> > +static int
> > +eal_parse_set(const char *input, uint16_t set[], unsigned num)
> > +{
> > +	unsigned idx;
> > +	const char *str = input;
> > +	char *end = NULL;
> > +	unsigned min, max;
> > +
> > +	memset(set, 0, num * sizeof(uint16_t));
> > +
> > +	while (isblank(*str))
> > +		str++;
> > +
> > +	/* only digit or left bracket is qulify for start point */
> > +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> > +		return -1;
> > +
> > +	/* process single number */
> > +	if (*str != '(') {
> > +		errno = 0;
> > +		idx = strtoul(str, &end, 10);
> > +		if (errno || end == NULL || idx >= num)
> > +			return -1;
> > +		else {
> > +			while (isblank(*end))
> > +				end++;
> > +
> > +			if (*end != ',' && *end != '\0' &&
> > +			    *end != '@')
> > +				return -1;
> > +
> > +			set[idx] = 1;
> > +			return end - input;
> > +		}
> > +	}
> > +
> > +	/* process set within bracket */
> > +	str++;
> > +	while (isblank(*str))
> > +		str++;
> > +	if (*str == '\0')
> > +		return -1;
> > +
> > +	min = RTE_MAX_LCORE;
> > +	do {
> > +
> > +		/* go ahead to the first digit */
> > +		while (isblank(*str))
> > +			str++;
> > +		if (!isdigit(*str))
> > +			return -1;
> > +
> > +		/* get the digit value */
> > +		errno = 0;
> > +		idx = strtoul(str, &end, 10);
> > +		if (errno || end == NULL || idx >= num)
> > +			return -1;
> > +
> > +		/* go ahead to separator '-',',' and ')' */
> > +		while (isblank(*end))
> > +			end++;
> > +		if (*end == '-') {
> > +			if (min == RTE_MAX_LCORE)
> > +				min = idx;
> > +			else /* avoid continuous '-' */
> > +				return -1;
> > +		} else if ((*end == ',') || (*end == ')')) {
> > +			max = idx;
> > +			if (min == RTE_MAX_LCORE)
> > +				min = idx;
> > +			for (idx = RTE_MIN(min, max);
> > +			     idx <= RTE_MAX(min, max); idx++)
> > +				set[idx] = 1;
> > +
> > +			min = RTE_MAX_LCORE;
> > +		} else
> > +			return -1;
> > +
> > +		str = end + 1;
> > +	} while (*end != '\0' && *end != ')');
> > +
> > +	return str - input;
> > +}
> > +
> > +/* convert from set array to cpuset bitmap */
> > +static inline int
> > +convert_to_cpuset(rte_cpuset_t *cpusetp,
> > +	      uint16_t *set, unsigned num)
> > +{
> > +	unsigned idx;
> > +
> > +	CPU_ZERO(cpusetp);
> > +
> > +	for (idx = 0; idx < num; idx++) {
> > +		if (!set[idx])
> > +			continue;
> > +
> > +		if (!lcore_config[idx].detected) {
> > +			RTE_LOG(ERR, EAL, "core %u "
> > +				"unavailable\n", idx);
> > +			return -1;
> > +		}
> > +
> > +		CPU_SET(idx, cpusetp);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
> > + * lcores, cpus could be a single digit or a group.
> > + * '(' and ')' are necessary if it's a group.
> > + * If not supply '@cpus', the value of cpus uses the same as lcores.
> > + * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means start 7 EAL thread as below
> > + *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > + *   lcore 1 runs on cpuset 0x2 (cpu 1)
> > + *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > + *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > + *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > + */
> > +static int
> > +eal_parse_lcores(const char *lcores)
> > +{
> > +	struct rte_config *cfg = rte_eal_get_configuration();
> > +	static uint16_t set[RTE_MAX_LCORE];
> > +	unsigned idx = 0;
> > +	int i;
> > +	unsigned count = 0;
> > +	const char *lcore_start = NULL;
> > +	const char *end = NULL;
> > +	int offset;
> > +	rte_cpuset_t cpuset;
> > +	int ret = -1;
> > +
> > +	if (lcores == NULL)
> > +		return -1;
> > +
> > +	/* Remove all blank characters ahead and after */
> > +	while (isblank(*lcores))
> > +		lcores++;
> > +	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> > +	while ((i > 0) && isblank(lcores[i - 1]))
> > +		i--;
> > +
> > +	CPU_ZERO(&cpuset);
> > +
> > +	/* Reset lcore config */
> > +	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > +		cfg->lcore_role[idx] = ROLE_OFF;
> > +		lcore_config[idx].core_index = -1;
> > +		CPU_ZERO(&lcore_config[idx].cpuset);
> > +	}
> > +
> > +	/* Get list of cores */
> > +	do {
> > +		while (isblank(*lcores))
> > +			lcores++;
> > +		if (*lcores == '\0')
> > +			goto err;
> > +
> > +		/* record lcore_set start point */
> > +		lcore_start = lcores;
> > +
> > +		/* go across a complete bracket */
> > +		if (*lcore_start == '(') {
> > +			lcores += strcspn(lcores, ")");
> > +			if (*lcores++ == '\0')
> > +				goto err;
> > +		}
> > +
> > +		/* scan the separator '@', ','(next) or '\0'(finish) */
> > +		lcores += strcspn(lcores, "@,");
> > +
> > +		if (*lcores == '@') {
> > +			/* explict assign cpu_set */
> > +			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
> > +			if (offset < 0)
> > +				goto err;
> > +
> > +			/* prepare cpu_set and update the end cursor */
> > +			if (0 > convert_to_cpuset(&cpuset,
> > +						  set, RTE_DIM(set)))
> > +				goto err;
> > +			end = lcores + 1 + offset;
> > +		} else  /* ',' or '\0' */
> > +			/* haven't given cpu_set, current loop done */
> > +			end = lcores;
> > +
> > +		if (*end != ',' && *end != '\0')
> > +			goto err;
> > +
> > +		/* parse lcore_set from start point */
> > +		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
> > +			goto err;
> > +
> > +		/* without '@', by default using lcore_set as cpu_set */
> > +		if (*lcores != '@' &&
> > +		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
> > +			goto err;
> > +
> > +		/* start to update lcore_set */
> > +		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > +			if (!set[idx])
> > +				continue;
> > +
> > +			if (cfg->lcore_role[idx] != ROLE_RTE) {
> > +				lcore_config[idx].core_index = count;
> > +				cfg->lcore_role[idx] = ROLE_RTE;
> > +				count++;
> > +			}
> > +			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
> > +				   sizeof(rte_cpuset_t));
> > +		}
> > +
> > +		lcores = end + 1;
> > +	} while (*end != '\0');
> > +
> > +	if (count == 0)
> > +		goto err;
> > +
> > +	cfg->lcore_count = count;
> > +	lcores_parsed = 1;
> > +	ret = 0;
> > +
> > +err:
> > +
> > +	return ret;
> > +}
> > +
> >  static int
> >  eal_parse_syslog(const char *facility, struct internal_config *conf)
> >  {
> > @@ -489,6 +728,13 @@ eal_parse_common_option(int opt, const char
> *optarg,
> >  		conf->log_level = log;
> >  		break;
> >  	}
> > +	case OPT_LCORES_NUM:
> > +		if (eal_parse_lcores(optarg) < 0) {
> > +			RTE_LOG(ERR, EAL, "invalid parameter for --"
> > +				OPT_LCORES "\n");
> > +			return -1;
> > +		}
> > +		break;
> >
> >  	/* don't know what to do, leave this to caller */
> >  	default:
> > @@ -527,7 +773,7 @@ eal_check_common_options(struct internal_config
> *internal_cfg)
> >
> >  	if (!lcores_parsed) {
> >  		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > -			"-c or -l\n");
> > +			"-c, -l or --lcores\n");
> >  		return -1;
> >  	}
> >  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > @@ -583,6 +829,14 @@ eal_common_usage(void)
> >  	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
> >  	       "                 where c1, c2, etc are core indexes between 0
> and %d\n"
> >  	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
> > +	       "  --"OPT_LCORES" MAP: maps between lcore_set to
> phys_cpu_set\n"
> > +	       "                 The argument format is\n"
> > +	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
> > +	       "                 lcores and cpus list are grouped by '(' and ')'\n"
> > +	       "                 Within the group, '-' is used for range
> separator,\n"
> > +	       "                 ',' is used for single number separator.\n"
> > +	       "                 '( )' can be omitted for single element group,
> '@' \n"
> > +	       "                 can be omitted if cpus and lcores has the same
> value\n"
> >  	       "  -n NUM       : Number of memory channels\n"
> >  	       "  -v           : Display version information on startup\n"
> >  	       "  -m MB        : memory to allocate (see also
> --"OPT_SOCKET_MEM")\n"
> > diff --git a/lib/librte_eal/common/eal_options.h
> b/lib/librte_eal/common/eal_options.h
> > index e476f8d..a1cc59f 100644
> > --- a/lib/librte_eal/common/eal_options.h
> > +++ b/lib/librte_eal/common/eal_options.h
> > @@ -77,6 +77,8 @@ enum {
> >  	OPT_CREATE_UIO_DEV_NUM,
> >  #define OPT_VFIO_INTR    "vfio-intr"
> >  	OPT_VFIO_INTR_NUM,
> > +#define OPT_LCORES "lcores"
> > +	OPT_LCORES_NUM,
> >  	OPT_LONG_MAX_NUM
> >  };
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/Makefile
> b/lib/librte_eal/linuxapp/eal/Makefile
> > index 0e9c447..025d836 100644
> > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > @@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
> >  CFLAGS_eal_pci.o := -D_GNU_SOURCE
> >  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
> >  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
> > +CFLAGS_eal_common_options.o := -D_GNU_SOURCE
> >
> >  # workaround for a gcc bug with noreturn attribute
> >  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> > --
> > 1.8.1.4
> >

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-22 15:17         ` Wodkowski, PawelX
@ 2015-01-25 15:34           ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-01-25 15:34 UTC (permalink / raw)
  To: Wodkowski, PawelX; +Cc: dev

Hi Pawel,

I don't see much difference there.
If you replace '@' with '.', '()' with '[]', and ',' with '/', they're almost the same.
We have no rx/tx case, so ':' is useless for us.
Semantically, '@' (at) is more readable than '.' for core assignment.
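
For reference, the cpuset masks in the '--lcores' example quoted below follow
directly from the cpu lists. A minimal sketch of that arithmetic, where
cpus_to_mask() is a hypothetical helper for illustration and not part of the
patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a bitmask with one bit set per cpu id.
 * Hypothetical helper; the patch itself uses CPU_SET() on rte_cpuset_t.
 */
static uint64_t
cpus_to_mask(const unsigned *cpus, unsigned n)
{
	uint64_t mask = 0;
	unsigned i;

	for (i = 0; i < n; i++) {
		assert(cpus[i] < 64);	/* the sketch only covers cpu 0..63 */
		mask |= UINT64_C(1) << cpus[i];
	}

	return mask;
}
```

With this, cpus {0,6} give 0x41, {1} gives 0x2, {5,6,7} give 0xe0 and {0,2}
give 0x5, matching the table in the commit message.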

-Liang Cunming

> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Thursday, January 22, 2015 11:17 PM
> To: Ananyev, Konstantin; Richardson, Bruce; Liang, Cunming
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu
> assignment
> 
> Hi,
> I want to mention that Pktgen-DPDK has a similar, but to me much more readable,
> syntax for defining the core-to-port mapping. Maybe we can adopt this syntax
> for the new '--lcores' parameter.
> 
> See '-m' parameter syntax on Pktgen readme.
> https://github.com/pktgen/Pktgen-DPDK/blob/master/dpdk/examples/pktgen/README.md
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > Sent: Thursday, January 22, 2015 3:34 PM
> > To: Richardson, Bruce; Liang, Cunming
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu
> > assignment
> >
> > Hi Bruce,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Thursday, January 22, 2015 12:19 PM
> > > To: Liang, Cunming
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v1 02/15] eal: new eal option '--lcores' for cpu
> > assignment
> > >
> > > On Thu, Jan 22, 2015 at 04:16:25PM +0800, Cunming Liang wrote:
> > > > It supports one new eal long option '--lcores' for EAL thread cpuset
> > assignment.
> > > >
> > > > The format pattern:
> > > > 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> > > > lcores, cpus could be a single digit or a group.
> > > > '(' and ')' are necessary if it's a group.
> > > > If not supply '@cpus', the value of cpus uses the same as lcores.
> > > >
> > > > e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL thread as below
> > > >   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > > >   lcore 1 runs on cpuset 0x2 (cpu 1)
> > > >   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > > >   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > > >   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > > >
> > >
> > > This strikes me as very confusing, though a couple of tweaks might help with
> > > readability. The lcore 0 at the end is especially confusing.
> >
> > Didn't get you here: do you find (0,6) confusing, right?
> > Because braces implicitly specifies affinity for group of en-braced lcores?
> >
> > > Perhaps we can
> > > limit the allowed formats here,
> > > * require the lcore_id to be specified - the lack of an lcore id for the last part
> > > makes having it as lcore 0 surprising.
> >
> > Again, not sure I understand you properly:  lcore_id(s) are always specified
> > explicitly.
> > Physical cpus part might be omitted.
> >
> > > * only allow one lcore id to be given for each set of cores.
> >
> > So you mean for '(3-5)@(0,2)' user would have to: '3@(0,2),4@(0,2),5@(0,2)'?
> > I don't see big difference here, but imagine you'd like to create a pool of 32 EAL-
> > threads running on same cpu set.
> > With current syntax it is just something like: '(32-63)@(0-7)'.
> > With what you proposing it will be a very long list.
> >
> > >
> > > I think it may still be readable if we allow the core set to be omitted if its
> > > to be the same as the lcore_id.
> >
> > I think that is supported.
> > See lcore_id=1 in Steve's example above.
> > As I understand: --lcores='0,2,3-5' is equal to '-l 0,2,3-5' and to '-c 0x3d'.
> >
> > Konstantin
> >
> > >
> > > It's probably still not going to be very tidy, but I think we can improve things.
> > >
> > > /Bruce
> > >
> > > > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > > > ---
> > > >  lib/librte_eal/common/eal_common_launch.c  |   1 -
> > > >  lib/librte_eal/common/eal_common_options.c | 262
> > ++++++++++++++++++++++++++++-
> > > >  lib/librte_eal/common/eal_options.h        |   2 +
> > > >  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
> > > >  4 files changed, 261 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/common/eal_common_launch.c
> > b/lib/librte_eal/common/eal_common_launch.c
> > > > index 599f83b..2d732b1 100644
> > > > --- a/lib/librte_eal/common/eal_common_launch.c
> > > > +++ b/lib/librte_eal/common/eal_common_launch.c
> > > > @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
> > > >  		rte_eal_wait_lcore(lcore_id);
> > > >  	}
> > > >  }
> > > > -
> > > > diff --git a/lib/librte_eal/common/eal_common_options.c
> > b/lib/librte_eal/common/eal_common_options.c
> > > > index e2810ab..fc47588 100644
> > > > --- a/lib/librte_eal/common/eal_common_options.c
> > > > +++ b/lib/librte_eal/common/eal_common_options.c
> > > > @@ -45,6 +45,7 @@
> > > >  #include <rte_lcore.h>
> > > >  #include <rte_version.h>
> > > >  #include <rte_devargs.h>
> > > > +#include <rte_memcpy.h>
> > > >
> > > >  #include "eal_internal_cfg.h"
> > > >  #include "eal_options.h"
> > > > @@ -85,6 +86,7 @@ eal_long_options[] = {
> > > >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> > > >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> > > >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > > > +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
> > > >  	{0, 0, 0, 0}
> > > >  };
> > > >
> > > > @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
> > > >  			if (min == RTE_MAX_LCORE)
> > > >  				min = idx;
> > > >  			for (idx = min; idx <= max; idx++) {
> > > > -				cfg->lcore_role[idx] = ROLE_RTE;
> > > > -				lcore_config[idx].core_index = count;
> > > > -				count++;
> > > > +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> > > > +					cfg->lcore_role[idx] = ROLE_RTE;
> > > > +					lcore_config[idx].core_index = count;
> > > > +					count++;
> > > > +				}
> > > >  			}
> > > >  			min = RTE_MAX_LCORE;
> > > >  		} else
> > > > @@ -289,6 +293,241 @@ eal_parse_master_lcore(const char *arg)
> > > >  	return 0;
> > > >  }
> > > >
> > > > +/*
> > > > + * Parse elem, the elem could be single number or '(' ')' group
> > > > + * Within group elem, '-' used for a range seperator;
> > > > + *                    ',' used for a single number.
> > > > + */
> > > > +static int
> > > > +eal_parse_set(const char *input, uint16_t set[], unsigned num)
> > > > +{
> > > > +	unsigned idx;
> > > > +	const char *str = input;
> > > > +	char *end = NULL;
> > > > +	unsigned min, max;
> > > > +
> > > > +	memset(set, 0, num * sizeof(uint16_t));
> > > > +
> > > > +	while (isblank(*str))
> > > > +		str++;
> > > > +
> > > > +	/* only digit or left bracket is qulify for start point */
> > > > +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> > > > +		return -1;
> > > > +
> > > > +	/* process single number */
> > > > +	if (*str != '(') {
> > > > +		errno = 0;
> > > > +		idx = strtoul(str, &end, 10);
> > > > +		if (errno || end == NULL || idx >= num)
> > > > +			return -1;
> > > > +		else {
> > > > +			while (isblank(*end))
> > > > +				end++;
> > > > +
> > > > +			if (*end != ',' && *end != '\0' &&
> > > > +			    *end != '@')
> > > > +				return -1;
> > > > +
> > > > +			set[idx] = 1;
> > > > +			return end - input;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* process set within bracket */
> > > > +	str++;
> > > > +	while (isblank(*str))
> > > > +		str++;
> > > > +	if (*str == '\0')
> > > > +		return -1;
> > > > +
> > > > +	min = RTE_MAX_LCORE;
> > > > +	do {
> > > > +
> > > > +		/* go ahead to the first digit */
> > > > +		while (isblank(*str))
> > > > +			str++;
> > > > +		if (!isdigit(*str))
> > > > +			return -1;
> > > > +
> > > > +		/* get the digit value */
> > > > +		errno = 0;
> > > > +		idx = strtoul(str, &end, 10);
> > > > +		if (errno || end == NULL || idx >= num)
> > > > +			return -1;
> > > > +
> > > > +		/* go ahead to separator '-',',' and ')' */
> > > > +		while (isblank(*end))
> > > > +			end++;
> > > > +		if (*end == '-') {
> > > > +			if (min == RTE_MAX_LCORE)
> > > > +				min = idx;
> > > > +			else /* avoid continuous '-' */
> > > > +				return -1;
> > > > +		} else if ((*end == ',') || (*end == ')')) {
> > > > +			max = idx;
> > > > +			if (min == RTE_MAX_LCORE)
> > > > +				min = idx;
> > > > +			for (idx = RTE_MIN(min, max);
> > > > +			     idx <= RTE_MAX(min, max); idx++)
> > > > +				set[idx] = 1;
> > > > +
> > > > +			min = RTE_MAX_LCORE;
> > > > +		} else
> > > > +			return -1;
> > > > +
> > > > +		str = end + 1;
> > > > +	} while (*end != '\0' && *end != ')');
> > > > +
> > > > +	return str - input;
> > > > +}
> > > > +
> > > > +/* convert from set array to cpuset bitmap */
> > > > +static inline int
> > > > +convert_to_cpuset(rte_cpuset_t *cpusetp,
> > > > +	      uint16_t *set, unsigned num)
> > > > +{
> > > > +	unsigned idx;
> > > > +
> > > > +	CPU_ZERO(cpusetp);
> > > > +
> > > > +	for (idx = 0; idx < num; idx++) {
> > > > +		if (!set[idx])
> > > > +			continue;
> > > > +
> > > > +		if (!lcore_config[idx].detected) {
> > > > +			RTE_LOG(ERR, EAL, "core %u "
> > > > +				"unavailable\n", idx);
> > > > +			return -1;
> > > > +		}
> > > > +
> > > > +		CPU_SET(idx, cpusetp);
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
> > > > + * lcores, cpus could be a single digit or a group.
> > > > + * '(' and ')' are necessary if it's a group.
> > > > + * If not supply '@cpus', the value of cpus uses the same as lcores.
> > > > + * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6)' means start 7 EAL thread as below
> > > > + *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> > > > + *   lcore 1 runs on cpuset 0x2 (cpu 1)
> > > > + *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> > > > + *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> > > > + *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> > > > + */
> > > > +static int
> > > > +eal_parse_lcores(const char *lcores)
> > > > +{
> > > > +	struct rte_config *cfg = rte_eal_get_configuration();
> > > > +	static uint16_t set[RTE_MAX_LCORE];
> > > > +	unsigned idx = 0;
> > > > +	int i;
> > > > +	unsigned count = 0;
> > > > +	const char *lcore_start = NULL;
> > > > +	const char *end = NULL;
> > > > +	int offset;
> > > > +	rte_cpuset_t cpuset;
> > > > +	int ret = -1;
> > > > +
> > > > +	if (lcores == NULL)
> > > > +		return -1;
> > > > +
> > > > +	/* Remove all blank characters ahead and after */
> > > > +	while (isblank(*lcores))
> > > > +		lcores++;
> > > > +	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> > > > +	while ((i > 0) && isblank(lcores[i - 1]))
> > > > +		i--;
> > > > +
> > > > +	CPU_ZERO(&cpuset);
> > > > +
> > > > +	/* Reset lcore config */
> > > > +	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > > > +		cfg->lcore_role[idx] = ROLE_OFF;
> > > > +		lcore_config[idx].core_index = -1;
> > > > +		CPU_ZERO(&lcore_config[idx].cpuset);
> > > > +	}
> > > > +
> > > > +	/* Get list of cores */
> > > > +	do {
> > > > +		while (isblank(*lcores))
> > > > +			lcores++;
> > > > +		if (*lcores == '\0')
> > > > +			goto err;
> > > > +
> > > > +		/* record lcore_set start point */
> > > > +		lcore_start = lcores;
> > > > +
> > > > +		/* go across a complete bracket */
> > > > +		if (*lcore_start == '(') {
> > > > +			lcores += strcspn(lcores, ")");
> > > > +			if (*lcores++ == '\0')
> > > > +				goto err;
> > > > +		}
> > > > +
> > > > +		/* scan the separator '@', ','(next) or '\0'(finish) */
> > > > +		lcores += strcspn(lcores, "@,");
> > > > +
> > > > +		if (*lcores == '@') {
> > > > +			/* explict assign cpu_set */
> > > > +			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
> > > > +			if (offset < 0)
> > > > +				goto err;
> > > > +
> > > > +			/* prepare cpu_set and update the end cursor */
> > > > +			if (0 > convert_to_cpuset(&cpuset,
> > > > +						  set, RTE_DIM(set)))
> > > > +				goto err;
> > > > +			end = lcores + 1 + offset;
> > > > +		} else  /* ',' or '\0' */
> > > > +			/* haven't given cpu_set, current loop done */
> > > > +			end = lcores;
> > > > +
> > > > +		if (*end != ',' && *end != '\0')
> > > > +			goto err;
> > > > +
> > > > +		/* parse lcore_set from start point */
> > > > +		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
> > > > +			goto err;
> > > > +
> > > > +		/* without '@', by default using lcore_set as cpu_set */
> > > > +		if (*lcores != '@' &&
> > > > +		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
> > > > +			goto err;
> > > > +
> > > > +		/* start to update lcore_set */
> > > > +		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
> > > > +			if (!set[idx])
> > > > +				continue;
> > > > +
> > > > +			if (cfg->lcore_role[idx] != ROLE_RTE) {
> > > > +				lcore_config[idx].core_index = count;
> > > > +				cfg->lcore_role[idx] = ROLE_RTE;
> > > > +				count++;
> > > > +			}
> > > > +			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
> > > > +				   sizeof(rte_cpuset_t));
> > > > +		}
> > > > +
> > > > +		lcores = end + 1;
> > > > +	} while (*end != '\0');
> > > > +
> > > > +	if (count == 0)
> > > > +		goto err;
> > > > +
> > > > +	cfg->lcore_count = count;
> > > > +	lcores_parsed = 1;
> > > > +	ret = 0;
> > > > +
> > > > +err:
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > >  static int
> > > >  eal_parse_syslog(const char *facility, struct internal_config *conf)
> > > >  {
> > > > @@ -489,6 +728,13 @@ eal_parse_common_option(int opt, const char
> > *optarg,
> > > >  		conf->log_level = log;
> > > >  		break;
> > > >  	}
> > > > +	case OPT_LCORES_NUM:
> > > > +		if (eal_parse_lcores(optarg) < 0) {
> > > > +			RTE_LOG(ERR, EAL, "invalid parameter for --"
> > > > +				OPT_LCORES "\n");
> > > > +			return -1;
> > > > +		}
> > > > +		break;
> > > >
> > > >  	/* don't know what to do, leave this to caller */
> > > >  	default:
> > > > @@ -527,7 +773,7 @@ eal_check_common_options(struct internal_config
> > *internal_cfg)
> > > >
> > > >  	if (!lcores_parsed) {
> > > >  		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > > > -			"-c or -l\n");
> > > > +			"-c, -l or --lcores\n");
> > > >  		return -1;
> > > >  	}
> > > >  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > > > @@ -583,6 +829,14 @@ eal_common_usage(void)
> > > >  	       "                 The argument format is
> <c1>[-c2][,c3[-c4],...]\n"
> > > >  	       "                 where c1, c2, etc are core indexes between
> 0 and %d\n"
> > > >  	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as
> master\n"
> > > > +	       "  --"OPT_LCORES" MAP: maps between lcore_set to
> > phys_cpu_set\n"
> > > > +	       "                 The argument format is\n"
> > > > +	       "
> 'lcores[@cpus]<,lcores[@cpus],...>'\n"
> > > > +	       "                 lcores and cpus list are grouped by '(' and
> ')'\n"
> > > > +	       "                 Within the group, '-' is used for range
> separator,\n"
> > > > +	       "                 ',' is used for single number separator.\n"
> > > > +	       "                 '( )' can be omitted for single element
> group, '@' \n"
> > > > +	       "                 can be omitted if cpus and lcores has the
> same value\n"
> > > >  	       "  -n NUM       : Number of memory channels\n"
> > > >  	       "  -v           : Display version information on startup\n"
> > > >  	       "  -m MB        : memory to allocate (see also --
> > "OPT_SOCKET_MEM")\n"
> > > > diff --git a/lib/librte_eal/common/eal_options.h
> > b/lib/librte_eal/common/eal_options.h
> > > > index e476f8d..a1cc59f 100644
> > > > --- a/lib/librte_eal/common/eal_options.h
> > > > +++ b/lib/librte_eal/common/eal_options.h
> > > > @@ -77,6 +77,8 @@ enum {
> > > >  	OPT_CREATE_UIO_DEV_NUM,
> > > >  #define OPT_VFIO_INTR    "vfio-intr"
> > > >  	OPT_VFIO_INTR_NUM,
> > > > +#define OPT_LCORES "lcores"
> > > > +	OPT_LCORES_NUM,
> > > >  	OPT_LONG_MAX_NUM
> > > >  };
> > > >
> > > > diff --git a/lib/librte_eal/linuxapp/eal/Makefile
> > b/lib/librte_eal/linuxapp/eal/Makefile
> > > > index 0e9c447..025d836 100644
> > > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > > @@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
> > > >  CFLAGS_eal_pci.o := -D_GNU_SOURCE
> > > >  CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
> > > >  CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
> > > > +CFLAGS_eal_common_options.o := -D_GNU_SOURCE
> > > >
> > > >  # workaround for a gcc bug with noreturn attribute
> > > >  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
> > > > --
> > > > 1.8.1.4
> > > >

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-01-25 23:04     ` Stephen Hemminger
  2015-01-27  4:55       ` Liang, Cunming
  2015-01-26 13:48     ` Stephen Hemminger
  1 sibling, 1 reply; 253+ messages in thread
From: Stephen Hemminger @ 2015-01-25 23:04 UTC (permalink / raw)
  To: Cunming Liang; +Cc: dev

On Thu, 22 Jan 2015 16:16:32 +0800
Cunming Liang <cunming.liang@intel.com> wrote:

> -	return rte_socket_id();
> +	unsigned socket_id = rte_socket_id();
> +
> +	if (socket_id == (unsigned)SOCKET_ID_ANY)

I prefer not casting -1 to unsigned; it will cause warnings.
It is better to make socket_id an integer and then have
the implicit cast in the return.
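
A minimal sketch of that suggestion, under assumptions: current_socket_id()
is a hypothetical stand-in for rte_socket_id() that returns int, and
FALLBACK_SOCKET is an illustrative value, not taken from the patch.

```c
#include <assert.h>

#define SOCKET_ID_ANY -1	/* "any socket" marker */
#define FALLBACK_SOCKET 0	/* assumption, for illustration only */

/* Hypothetical stand-in for rte_socket_id(); a non-EAL thread sees -1. */
static int fake_socket_id = SOCKET_ID_ANY;

static int
current_socket_id(void)
{
	return fake_socket_id;
}

static unsigned
pick_numa_socket(void)
{
	int socket_id = current_socket_id();

	/* signed compare against -1, no cast needed */
	if (socket_id == SOCKET_ID_ANY)
		return FALLBACK_SOCKET;

	assert(socket_id >= 0);
	return socket_id;	/* implicit int -> unsigned conversion */
}
```

The comparison stays between two signed values, and the only conversion left
is the implicit one in the return statement.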

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY
  2015-01-22  8:16   ` [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
  2015-01-25 23:04     ` Stephen Hemminger
@ 2015-01-26 13:48     ` Stephen Hemminger
  1 sibling, 0 replies; 253+ messages in thread
From: Stephen Hemminger @ 2015-01-26 13:48 UTC (permalink / raw)
  To: Cunming Liang; +Cc: dev

On Thu, 22 Jan 2015 16:16:32 +0800
Cunming Liang <cunming.liang@intel.com> wrote:

> -	return rte_socket_id();
> +	unsigned socket_id = rte_socket_id();
> +
> +	if (socket_id == (unsigned)SOCKET_ID_ANY)

I prefer not casting -1 to unsigned; it will cause warnings.
It is better to make socket_id an integer and then have
the implicit cast in the return.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY
  2015-01-25 23:04     ` Stephen Hemminger
@ 2015-01-27  4:55       ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-01-27  4:55 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev



> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Sunday, January 25, 2015 4:05 PM
> To: Liang, Cunming
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 09/15] malloc: fix the issue of SOCKET_ID_ANY
> 
> On Thu, 22 Jan 2015 16:16:32 +0800
> Cunming Liang <cunming.liang@intel.com> wrote:
> 
> > -	return rte_socket_id();
> > +	unsigned socket_id = rte_socket_id();
> > +
> > +	if (socket_id == (unsigned)SOCKET_ID_ANY)
> 
> I prefer not casting -1 to unsigned it will cause warnings.
> It is better to make socket_id an integer and then have
> the implicit cast in the return.
[Liang, Cunming] I didn't get a warning about it. Which compiler version complains about it?

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 00/15] support multi-pthread per core
  2015-01-22  8:16 ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Cunming Liang
                     ` (15 preceding siblings ...)
  2015-01-22 14:14   ` [dpdk-dev] [PATCH v1 00/15] support multi-pthread per core Ananyev, Konstantin
@ 2015-01-28  6:59   ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 01/15] eal: add cpuset into per EAL thread lcore_config Cunming Liang
                       ` (15 more replies)
  16 siblings, 16 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

v2 changes:
  add '<number>-<number>' support for EAL option '--lcores' 
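
As an illustration only, range handling of this kind can be sketched as
below. mark_range() is a hypothetical helper and not the parser from the
patch; it assumes the range is inclusive and that either bound may come first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse "<number>" or "<number>-<number>" and mark every index of the
 * inclusive range in set[]. Returns 0 on success, -1 on malformed or
 * out-of-range input. Hypothetical helper for illustration.
 */
static int
mark_range(const char *str, unsigned char set[], unsigned num)
{
	char *end = NULL;
	unsigned long lo, hi, idx;

	assert(str != NULL && set != NULL);

	errno = 0;
	lo = strtoul(str, &end, 10);
	if (errno || end == str || lo >= num)
		return -1;

	hi = lo;
	if (*end == '-') {
		const char *second = end + 1;

		hi = strtoul(second, &end, 10);
		if (errno || end == second || hi >= num)
			return -1;
	}
	if (*end != '\0')
		return -1;

	/* accept "5-3" as well as "3-5" */
	if (lo > hi) {
		unsigned long tmp = lo;

		lo = hi;
		hi = tmp;
	}
	for (idx = lo; idx <= hi; idx++)
		set[idx] = 1;

	return 0;
}
```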

The patch series contains EAL enhancements and library fixes that allow
running multiple pthreads (either EAL or non-EAL threads) per physical core.
The two major changes are listed below:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore stands for an EAL thread rather than a logical core.
  The change adds a new EAL option to allow static lcore-to-cpuset assignment.
  An lcore (EAL thread) is then affinitized to a cpuset; the original 1:1 mapping is a special case.
- Fix the libraries to allow running on any non-EAL thread.
  This closes the gaps when running the libraries in a non-EAL thread (dynamically created by the user).
  Each fixed library takes care of the case of rte_lcore_id() >= RTE_MAX_LCORE.
  
Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in RFC review.


Cunming Liang (15):
  eal: add cpuset into per EAL thread lcore_config
  eal: new eal option '--lcores' for cpu assignment
  eal: add support parsing socket_id from cpuset
  eal: new TLS definition and API declaration
  eal: add eal_common_thread.c for common thread API
  eal: add rte_gettid() to acquire unique system tid
  eal: apply affinity of EAL thread by assigned cpuset
  enic: fix re-define freebsd compile complain
  malloc: fix the issue of SOCKET_ID_ANY
  log: fix the gap to support non-EAL thread
  eal: set _lcore_id and _socket_id to (-1) by default
  eal: fix recursive spinlock in non-EAL thraed
  mempool: add support to non-EAL thread
  ring: add support to non-EAL thread
  timer: add support to non-EAL thread

 lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
 lib/librte_eal/bsdapp/eal/eal.c                    |  13 +-
 lib/librte_eal/bsdapp/eal/eal_lcore.c              |  14 +
 lib/librte_eal/bsdapp/eal/eal_memory.c             |   2 +
 lib/librte_eal/bsdapp/eal/eal_thread.c             |  76 +++---
 lib/librte_eal/common/eal_common_launch.c          |   1 -
 lib/librte_eal/common/eal_common_log.c             |  17 +-
 lib/librte_eal/common/eal_common_options.c         | 300 ++++++++++++++++++++-
 lib/librte_eal/common/eal_common_thread.c          | 142 ++++++++++
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/eal_thread.h                 |  66 +++++
 .../common/include/generic/rte_spinlock.h          |   4 +-
 lib/librte_eal/common/include/rte_eal.h            |  27 ++
 lib/librte_eal/common/include/rte_lcore.h          |  37 ++-
 lib/librte_eal/common/include/rte_log.h            |   5 +
 lib/librte_eal/linuxapp/eal/Makefile               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   7 +-
 lib/librte_eal/linuxapp/eal/eal_lcore.c            |  15 ++
 lib/librte_eal/linuxapp/eal/eal_thread.c           |  78 +++---
 lib/librte_malloc/malloc_heap.h                    |   7 +-
 lib/librte_mempool/rte_mempool.h                   |  18 +-
 lib/librte_pmd_enic/enic.h                         |   1 +
 lib/librte_pmd_enic/enic_compat.h                  |   1 +
 lib/librte_ring/rte_ring.h                         |  10 +-
 lib/librte_timer/rte_timer.c                       |  40 ++-
 lib/librte_timer/rte_timer.h                       |   2 +-
 26 files changed, 759 insertions(+), 131 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 01/15] eal: add cpuset into per EAL thread lcore_config
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 02/15] eal: new eal option '--lcores' for cpu assignment Cunming Liang
                       ` (14 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

The patch adds 'cpuset' to the per-lcore configuration 'lcore_config[]',
as an lcore is no longer always pinned 1:1 to a physical cpu.
An lcore now stands for an EAL thread rather than a logical cpu.

It doesn't change the default behavior of 1:1 mapping, but allows
an EAL thread to be affinitized to multiple cpus.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
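Note (not part of the commit message): a minimal, Linux-only sketch of the
cpuset bookkeeping described above. struct demo_lcore and demo_lcore_pin()
are hypothetical names for illustration; the patch stores the set in
lcore_config[].cpuset instead. _GNU_SOURCE is needed for the CPU_* macros,
which is also why the Makefile gains a -D_GNU_SOURCE entry.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical, trimmed-down per-lcore record; only the cpuset matters. */
struct demo_lcore {
	cpu_set_t cpuset;
};

/* Reset the lcore's cpuset, then add each listed cpu to it. */
static void
demo_lcore_pin(struct demo_lcore *lc, const unsigned *cpus, unsigned n)
{
	unsigned i;

	CPU_ZERO(&lc->cpuset);
	for (i = 0; i < n; i++) {
		assert(cpus[i] < CPU_SETSIZE);
		CPU_SET(cpus[i], &lc->cpuset);
	}
}
```

Passing a single cpu reproduces the default 1:1 mapping; passing several
gives the 1:n case.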
 lib/librte_eal/bsdapp/eal/eal_lcore.c     | 7 +++++++
 lib/librte_eal/bsdapp/eal/eal_memory.c    | 2 ++
 lib/librte_eal/common/include/rte_lcore.h | 8 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile      | 1 +
 lib/librte_eal/linuxapp/eal/eal_lcore.c   | 8 ++++++++
 5 files changed, 26 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 662f024..72f8ac2 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -76,11 +76,18 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
 		lcore_config[lcore_id].detected = (lcore_id < ncpus);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 65ee87d..a34d500 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -45,6 +45,8 @@
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 
+/* avoid re-defined against with freebsd header */
+#undef PAGE_SIZE
 #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
 
 /*
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 49b2c03..4c7d6bb 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -50,6 +50,13 @@ extern "C" {
 
 #define LCORE_ID_ANY -1    /**< Any lcore. */
 
+#if defined(__linux__)
+	typedef	cpu_set_t rte_cpuset_t;
+#elif defined(__FreeBSD__)
+#include <pthread_np.h>
+	typedef cpuset_t rte_cpuset_t;
+#endif
+
 /**
  * Structure storing internal configuration (per-lcore)
  */
@@ -65,6 +72,7 @@ struct lcore_config {
 	unsigned socket_id;        /**< physical socket id for this lcore */
 	unsigned core_id;          /**< core number on socket for this lcore */
 	int core_index;            /**< relative index, starting from 0 */
+	rte_cpuset_t cpuset;       /**< cpu set the lcore is bound to */
 };
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 72ecf3a..0e9c447 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -87,6 +87,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
+CFLAGS_eal_lcore.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index c67e0e6..29615f8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -158,11 +158,19 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
+		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, each lcore maps 1:1 to a cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
-- 
1.8.1.4


* [dpdk-dev] [PATCH v2 02/15] eal: new eal option '--lcores' for cpu assignment
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 01/15] eal: add cpuset into per EAL thread lcore_config Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 03/15] eal: add support parsing socket_id from cpuset Cunming Liang
                       ` (13 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

Add a new EAL long option, '--lcores', for assigning cpusets to EAL threads.

The format pattern:
	--lcores='lcores[@cpus]<,lcores[@cpus]>'
Both lcores and cpus can be a single number, a range, or a group.
A group must be enclosed in '(' and ')'.
If '@cpus' is not supplied, cpus takes the same value as lcores.

e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' starts 9 EAL threads as below
  lcore 0 runs on cpuset 0x41 (cpu 0,6)
  lcore 1 runs on cpuset 0x2 (cpu 1)
  lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
  lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
  lcore 6 runs on cpuset 0x41 (cpu 0,6)
  lcore 7 runs on cpuset 0x80 (cpu 7)
  lcore 8 runs on cpuset 0x100 (cpu 8)
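
As a rough illustration only (not part of the patch), the cpuset masks
listed above follow directly from the cpu lists. cpus_to_mask() is a
hypothetical helper and assumes cpu ids below 64 so that the mask fits
in a uint64_t; the real code stores the set in an rte_cpuset_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a bitmask from a list of cpu ids (each < 64). */
static uint64_t
cpus_to_mask(const unsigned *cpus, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(cpus[i] < 64);
		/* bit N of the mask stands for cpu N */
		mask |= UINT64_C(1) << cpus[i];
	}
	return mask;
}
```

For instance, passing the list {0, 6} gives 0x41 and {5, 6, 7} gives 0xe0,
matching the lcore 0 and lcore 2 lines in the example.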

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_launch.c  |   1 -
 lib/librte_eal/common/eal_common_options.c | 300 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/Makefile       |   1 +
 4 files changed, 299 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
index 599f83b..2d732b1 100644
--- a/lib/librte_eal/common/eal_common_launch.c
+++ b/lib/librte_eal/common/eal_common_launch.c
@@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
 		rte_eal_wait_lcore(lcore_id);
 	}
 }
-
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 67e02dc..29ebb6f 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -45,6 +45,7 @@
 #include <rte_lcore.h>
 #include <rte_version.h>
 #include <rte_devargs.h>
+#include <rte_memcpy.h>
 
 #include "eal_internal_cfg.h"
 #include "eal_options.h"
@@ -85,6 +86,7 @@ eal_long_options[] = {
 	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
 	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
 	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
+	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
 	{0, 0, 0, 0}
 };
 
@@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
 			if (min == RTE_MAX_LCORE)
 				min = idx;
 			for (idx = min; idx <= max; idx++) {
-				cfg->lcore_role[idx] = ROLE_RTE;
-				lcore_config[idx].core_index = count;
-				count++;
+				if (cfg->lcore_role[idx] != ROLE_RTE) {
+					cfg->lcore_role[idx] = ROLE_RTE;
+					lcore_config[idx].core_index = count;
+					count++;
+				}
 			}
 			min = RTE_MAX_LCORE;
 		} else
@@ -292,6 +296,279 @@ eal_parse_master_lcore(const char *arg)
 	return 0;
 }
 
+/*
+ * Parse an element: a single number, a range, or a '(' ')' group.
+ * Within a group, '-' is the range separator and
+ *                 ',' separates single numbers.
+ */
+static int
+eal_parse_set(const char *input, uint16_t set[], unsigned num)
+{
+	unsigned idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit or a left bracket qualifies as a start point */
+	if ((!isdigit(*str) && *str != '(') || *str == '\0')
+		return -1;
+
+	/* process single number or single range of number */
+	if (*str != '(') {
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+		else {
+			while (isblank(*end))
+				end++;
+
+			min = idx;
+			max = idx;
+			if (*end == '-') {
+				/* process single <number>-<number> */
+				end++;
+				while (isblank(*end))
+					end++;
+				if (!isdigit(*end))
+					return -1;
+
+				errno = 0;
+				idx = strtoul(end, &end, 10);
+				if (errno || end == NULL || idx >= num)
+					return -1;
+				max = idx;
+				while (isblank(*end))
+					end++;
+				if (*end != ',' && *end != '\0' && *end != '@')
+					return -1;
+			}
+
+			if (*end != ',' && *end != '\0' &&
+			    *end != '@')
+				return -1;
+
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			return end - input;
+		}
+	}
+
+	/* process set within bracket */
+	str++;
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = RTE_MAX_LCORE;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-',',' and ')' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == ')')) {
+			max = idx;
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			min = RTE_MAX_LCORE;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0' && *end != ')');
+
+	return str - input;
+}
+
+/* convert from set array to cpuset bitmap */
+static inline int
+convert_to_cpuset(rte_cpuset_t *cpusetp,
+	      uint16_t *set, unsigned num)
+{
+	unsigned idx;
+
+	CPU_ZERO(cpusetp);
+
+	for (idx = 0; idx < num; idx++) {
+		if (!set[idx])
+			continue;
+
+		if (!lcore_config[idx].detected) {
+			RTE_LOG(ERR, EAL, "core %u "
+				"unavailable\n", idx);
+			return -1;
+		}
+
+		CPU_SET(idx, cpusetp);
+	}
+
+	return 0;
+}
+
+/*
+ * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
+ * lcores and cpus can each be a single number, a range, or a group.
+ * A group must be enclosed in '(' and ')'.
+ * If '@cpus' is not supplied, cpus takes the same value as lcores.
+ * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' starts 9 EAL threads as below
+ *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 1 runs on cpuset 0x2 (cpu 1)
+ *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
+ *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
+ *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 7 runs on cpuset 0x80 (cpu 7)
+ *   lcore 8 runs on cpuset 0x100 (cpu 8)
+ */
+static int
+eal_parse_lcores(const char *lcores)
+{
+	struct rte_config *cfg = rte_eal_get_configuration();
+	static uint16_t set[RTE_MAX_LCORE];
+	unsigned idx = 0;
+	int i;
+	unsigned count = 0;
+	const char *lcore_start = NULL;
+	const char *end = NULL;
+	int offset;
+	rte_cpuset_t cpuset;
+	int lflags = 0;
+	int ret = -1;
+
+	if (lcores == NULL)
+		return -1;
+
+	/* Remove all blank characters ahead and after */
+	while (isblank(*lcores))
+		lcores++;
+	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	while ((i > 0) && isblank(lcores[i - 1]))
+		i--;
+
+	CPU_ZERO(&cpuset);
+
+	/* Reset lcore config */
+	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+		cfg->lcore_role[idx] = ROLE_OFF;
+		lcore_config[idx].core_index = -1;
+		CPU_ZERO(&lcore_config[idx].cpuset);
+	}
+
+	/* Get list of cores */
+	do {
+		while (isblank(*lcores))
+			lcores++;
+		if (*lcores == '\0')
+			goto err;
+
+		/* record lcore_set start point */
+		lcore_start = lcores;
+
+		/* go across a complete bracket */
+		if (*lcore_start == '(') {
+			lcores += strcspn(lcores, ")");
+			if (*lcores++ == '\0')
+				goto err;
+		}
+
+		/* scan the separator '@', ','(next) or '\0'(finish) */
+		lcores += strcspn(lcores, "@,");
+
+		if (*lcores == '@') {
+			/* explicitly assigned cpu_set */
+			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
+			if (offset < 0)
+				goto err;
+
+			/* prepare cpu_set and update the end cursor */
+			if (0 > convert_to_cpuset(&cpuset,
+						  set, RTE_DIM(set)))
+				goto err;
+			end = lcores + 1 + offset;
+		} else { /* ',' or '\0' */
+			/* haven't given cpu_set, current loop done */
+			end = lcores;
+
+			/* go back to check <number>-<number> */
+			offset = strcspn(lcore_start, "-");
+			if (offset < (end - lcore_start))
+				lflags = 1;
+		}
+
+		if (*end != ',' && *end != '\0')
+			goto err;
+
+		/* parse lcore_set from start point */
+		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
+			goto err;
+
+		/* without '@', by default using lcore_set as cpu_set */
+		if (*lcores != '@' &&
+		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
+			goto err;
+
+		/* start to update lcore_set */
+		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+			if (!set[idx])
+				continue;
+
+			if (cfg->lcore_role[idx] != ROLE_RTE) {
+				lcore_config[idx].core_index = count;
+				cfg->lcore_role[idx] = ROLE_RTE;
+				count++;
+			}
+
+			if (lflags) {
+				CPU_ZERO(&cpuset);
+				CPU_SET(idx, &cpuset);
+			}
+			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
+				   sizeof(rte_cpuset_t));
+		}
+
+		lcores = end + 1;
+	} while (*end != '\0');
+
+	if (count == 0)
+		goto err;
+
+	cfg->lcore_count = count;
+	lcores_parsed = 1;
+	ret = 0;
+
+err:
+
+	return ret;
+}
+
 static int
 eal_parse_syslog(const char *facility, struct internal_config *conf)
 {
@@ -492,6 +769,13 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->log_level = log;
 		break;
 	}
+	case OPT_LCORES_NUM:
+		if (eal_parse_lcores(optarg) < 0) {
+			RTE_LOG(ERR, EAL, "invalid parameter for --"
+				OPT_LCORES "\n");
+			return -1;
+		}
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -530,7 +814,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
 
 	if (!lcores_parsed) {
 		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
-			"-c or -l\n");
+			"-c, -l or --lcores\n");
 		return -1;
 	}
 	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
@@ -586,6 +870,14 @@ eal_common_usage(void)
 	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
 	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
 	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
+	       "  --"OPT_LCORES" MAP: map an lcore set to a physical cpu set\n"
+	       "                 The argument format is\n"
+	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
+	       "                 lcores and cpus list are grouped by '(' and ')'\n"
+	       "                 Within the group, '-' is used for range separator,\n"
+	       "                 ',' is used for single number separator.\n"
+	       "                 '( )' can be omitted for a single element group, '@'\n"
+	       "                 can be omitted if cpus and lcores have the same value\n"
 	       "  -n NUM       : Number of memory channels\n"
 	       "  -v           : Display version information on startup\n"
 	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e476f8d..a1cc59f 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -77,6 +77,8 @@ enum {
 	OPT_CREATE_UIO_DEV_NUM,
 #define OPT_VFIO_INTR    "vfio-intr"
 	OPT_VFIO_INTR_NUM,
+#define OPT_LCORES "lcores"
+	OPT_LCORES_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 0e9c447..025d836 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
+CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4


* [dpdk-dev] [PATCH v2 03/15] eal: add support parsing socket_id from cpuset
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 01/15] eal: add cpuset into per EAL thread lcore_config Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 02/15] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 04/15] eal: new TLS definition and API declaration Cunming Liang
                       ` (12 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

Return the socket_id if all cpus in the cpuset belong to the same
NUMA node; otherwise return SOCKET_ID_ANY.
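
A minimal sketch of that rule, for illustration only. The names below
(sketch_cpuset_socket_id, SKETCH_SOCKET_ID_ANY) and the caller-supplied
cpu-to-node table are assumptions made for the example; the patch itself
walks an rte_cpuset_t and asks eal_cpu_socket_id() for each cpu.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SOCKET_ID_ANY (-1)	/* stand-in for SOCKET_ID_ANY */

/*
 * cpu_in_set[i] != 0 means cpu i is a member of the set.
 * cpu_socket[i] is an assumed cpu -> NUMA node table.
 */
static int
sketch_cpuset_socket_id(const int *cpu_in_set, const int *cpu_socket,
			size_t ncpus)
{
	int socket_id = SKETCH_SOCKET_ID_ANY;
	size_t cpu;

	assert(cpu_in_set != NULL && cpu_socket != NULL);

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (!cpu_in_set[cpu])
			continue;

		if (socket_id == SKETCH_SOCKET_ID_ANY)
			/* first member: remember its node */
			socket_id = cpu_socket[cpu];
		else if (socket_id != cpu_socket[cpu])
			/* members span more than one node */
			return SKETCH_SOCKET_ID_ANY;
	}

	/* an empty set also yields SKETCH_SOCKET_ID_ANY */
	return socket_id;
}
```

With cpus 0,1 on node 0 and cpus 2,3 on node 1, the set {0,1} maps to
node 0 while the set {1,2} maps to "any".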

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++
 lib/librte_eal/common/eal_thread.h      | 52 +++++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +++++
 3 files changed, 66 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 72f8ac2..162fb4f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -41,6 +41,7 @@
 #include <rte_debug.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 
 /* No topology information available on FreeBSD including NUMA info */
 #define cpu_core_id(X) 0
@@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(__rte_unused unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index b53b84d..a25ee86 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -34,6 +34,10 @@
 #ifndef EAL_THREAD_H
 #define EAL_THREAD_H
 
+#include <sched.h>
+
+#include <rte_debug.h>
+
 /**
  * basic loop of thread, called for each thread by eal_init().
  *
@@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
  */
 void eal_thread_init_master(unsigned lcore_id);
 
+/**
+ * Get the NUMA socket id from cpu id.
+ * This function is private to EAL.
+ *
+ * @param cpu_id
+ *   The logical processor id.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+unsigned eal_cpu_socket_id(unsigned cpu_id);
+
+/**
+ * Get the NUMA socket id from cpuset.
+ * This function is private to EAL.
+ *
+ * @param cpusetp
+ *   Pointer to a valid cpu set.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+static inline int
+eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
+{
+	unsigned cpu = 0;
+	int socket_id = SOCKET_ID_ANY;
+	int sid;
+
+	if (cpusetp == NULL)
+		return SOCKET_ID_ANY;
+
+	do {
+		if (!CPU_ISSET(cpu, cpusetp))
+			continue;
+
+		if (socket_id == SOCKET_ID_ANY)
+			socket_id = eal_cpu_socket_id(cpu);
+
+		sid = eal_cpu_socket_id(cpu);
+		if (socket_id != sid) {
+			socket_id = SOCKET_ID_ANY;
+			break;
+		}
+
+	} while (++cpu < RTE_MAX_LCORE);
+
+	return socket_id;
+}
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index 29615f8..922af6d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -45,6 +45,7 @@
 
 #include "eal_private.h"
 #include "eal_filesystem.h"
+#include "eal_thread.h"
 
 #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
 #define CORE_ID_FILE "topology/core_id"
@@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
-- 
1.8.1.4


* [dpdk-dev] [PATCH v2 04/15] eal: new TLS definition and API declaration
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (2 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 03/15] eal: add support parsing socket_id from cpuset Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 05/15] eal: add eal_common_thread.c for common thread API Cunming Liang
                       ` (11 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

1. add two TLS variables, *_socket_id* and *_cpuset*
2. add two external APIs, rte_thread_set_affinity and rte_thread_get_affinity
3. add one internal API, eal_thread_dump_affinity

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
 lib/librte_eal/common/eal_thread.h        | 14 ++++++++++++++
 lib/librte_eal/common/include/rte_lcore.h | 29 +++++++++++++++++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
 4 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index ab05368..10220c7 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index a25ee86..28edf51 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
 	return socket_id;
 }
 
+/**
+ * Dump the current pthread cpuset.
+ * This function is private to EAL.
+ *
+ * @param str
+ *   The string buffer the cpuset will dump to.
+ * @param size
+ *   The string buffer size.
+ */
+#define CPU_STR_LEN            256
+void
+eal_thread_dump_affinity(char str[], unsigned size);
+
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 4c7d6bb..facdbdc 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -43,6 +43,7 @@
 #include <rte_per_lcore.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
+#include <rte_memory.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -80,7 +81,9 @@ struct lcore_config {
  */
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];
 
-RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
+RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
+RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
+RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
 
 /**
  * Return the ID of the execution unit we are running on.
@@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
 static inline unsigned
 rte_socket_id(void)
 {
-	return lcore_config[rte_lcore_id()].socket_id;
+	return RTE_PER_LCORE(_socket_id);
 }
 
 /**
@@ -229,6 +232,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap)
 	     i<RTE_MAX_LCORE;						\
 	     i = rte_get_next_lcore(i, 1, 0))
 
+/**
+ * Set core affinity of the current thread.
+ * Supports both EAL and non-EAL threads and updates TLS.
+ *
+ * @param cpusetp
+ *   Pointer to the cpu set to apply to the current thread.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_set_affinity(rte_cpuset_t *cpusetp);
+
+/**
+ * Get core affinity of the current thread.
+ *
+ * @param cpusetp
+ *   Pointer to the cpu set that receives the current thread's affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_get_affinity(rte_cpuset_t *cpusetp);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 80a985f..748a83a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
-- 
1.8.1.4


* [dpdk-dev] [PATCH v2 05/15] eal: add eal_common_thread.c for common thread API
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (3 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 04/15] eal: new TLS definition and API declaration Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 06/15] eal: add rte_gettid() to acquire unique system tid Cunming Liang
                       ` (10 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

The API works for both EAL and non-EAL threads.
When rte_thread_set_affinity is called, the *_socket_id* and
*_cpuset* of the calling thread are updated if the thread
successfully sets its cpu affinity.
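
The "apply first, then cache" ordering can be sketched as below. This is
an illustration under assumptions, not the patch code: it is Linux-only,
uses sched_setaffinity() on the calling thread as a stand-in for
pthread_setaffinity_np(), and the sketch_* names and the __thread
variable (a GCC/Clang extension) are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <string.h>

/* Per-thread cache, standing in for the patch's _cpuset TLS variable. */
static __thread cpu_set_t sketch_cpuset;

/* Apply the affinity first; update the cache only if that succeeded. */
static int
sketch_set_affinity(const cpu_set_t *cpusetp)
{
	if (cpusetp == NULL)
		return -1;

	/* pid 0 means the calling thread */
	if (sched_setaffinity(0, sizeof(*cpusetp), cpusetp) != 0)
		return -1;

	memcpy(&sketch_cpuset, cpusetp, sizeof(sketch_cpuset));
	return 0;
}

/* Read back the cached set without another system call. */
static int
sketch_get_affinity(cpu_set_t *cpusetp)
{
	if (cpusetp == NULL)
		return -1;

	memcpy(cpusetp, &sketch_cpuset, sizeof(*cpusetp));
	return 0;
}
```

Because the cache is written only after the system call succeeds, a
failed request leaves the previously cached set untouched.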

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile        |   1 +
 lib/librte_eal/common/eal_common_thread.c | 142 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile      |   2 +
 3 files changed, 145 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index d434882..78406be 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -73,6 +73,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
new file mode 100644
index 0000000..d996690
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -0,0 +1,142 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <sched.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memcpy.h>
+
+#include "eal_thread.h"
+
+int
+rte_thread_set_affinity(rte_cpuset_t *cpusetp)
+{
+	int s;
+	unsigned lcore_id;
+	pthread_t tid;
+
+	if (!cpusetp)
+		return -1;
+
+	lcore_id = rte_lcore_id();
+	if (lcore_id != (unsigned)LCORE_ID_ANY) {
+		/* EAL thread */
+		tid = lcore_config[lcore_id].thread_id;
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* update lcore_config */
+		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
+		rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
+			   sizeof(rte_cpuset_t));
+	} else {
+		/* non-EAL thread */
+		tid = pthread_self();
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+	}
+
+	return 0;
+}
+
+int
+rte_thread_get_affinity(rte_cpuset_t *cpusetp)
+{
+	if (!cpusetp)
+		return -1;
+
+	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
+		   sizeof(rte_cpuset_t));
+
+	return 0;
+}
+
+void
+eal_thread_dump_affinity(char str[], unsigned size)
+{
+	rte_cpuset_t cpuset;
+	unsigned cpu;
+	int ret;
+	unsigned int out = 0;
+
+	if (rte_thread_get_affinity(&cpuset) < 0) {
+		str[0] = '\0';
+		return;
+	}
+
+	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
+		if (!CPU_ISSET(cpu, &cpuset))
+			continue;
+
+		ret = snprintf(str + out,
+			       size - out, "%u,", cpu);
+		if (ret < 0 || (unsigned)ret >= size - out)
+			break;
+
+		out += ret;
+	}
+
+	/* remove the last separator */
+	if (out > 0)
+		str[out - 1] = '\0';
+}
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 025d836..07e21ca 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -85,6 +85,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
@@ -96,6 +97,7 @@ CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
+CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4


* [dpdk-dev] [PATCH v2 06/15] eal: add rte_gettid() to acquire unique system tid
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (4 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 05/15] eal: add eal_common_thread.c for common thread API Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 07/15] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
                       ` (9 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

rte_gettid() wraps the thread-id system call on Linux (gettid) and
FreeBSD (thr_self).
It provides a persistent unique thread id for the calling thread.
The id is cached in TLS on the first call.
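
A minimal Linux-only sketch of the caching idea, for illustration. The
names sketch_tid and sketch_gettid are hypothetical, and __thread is a
GCC/Clang extension used here in place of the RTE_PER_LCORE macros.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Cached per-thread id; -1 means "not yet queried". */
static __thread int sketch_tid = -1;

static int
sketch_gettid(void)
{
	if (sketch_tid == -1) {
		/* only the first call in each thread enters the kernel */
		sketch_tid = (int)syscall(SYS_gettid);
		assert(sketch_tid > 0);
	}
	return sketch_tid;
}
```

Later calls from the same thread return the cached value, so the lookup
is cheap enough to use on a fast path.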

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   |  9 +++++++++
 lib/librte_eal/common/include/rte_eal.h  | 27 +++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_thread.c |  7 +++++++
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 10220c7..d0c077b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <sched.h>
 #include <pthread_np.h>
 #include <sys/queue.h>
+#include <sys/thr.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* acquire the calling thread's tid via thr_self() */
+int rte_sys_gettid(void)
+{
+	long lwpid;
+	thr_self(&lwpid);
+	return (int)lwpid;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..8ccdd65 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -41,6 +41,9 @@
  */
 
 #include <stdint.h>
+#include <sched.h>
+
+#include <rte_per_lcore.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t usage_func );
  */
 int rte_eal_has_hugepages(void);
 
+/**
+ * A wrapper API for the gettid syscall.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   This call always succeeds.
+ */
+int rte_sys_gettid(void);
+
+/**
+ * Get system unique thread id.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   This call always succeeds.
+ */
+static inline int rte_gettid(void)
+{
+	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
+	if (RTE_PER_LCORE(_thread_id) == -1)
+		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
+	return RTE_PER_LCORE(_thread_id);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 748a83a..ed20c93 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <pthread.h>
 #include <sched.h>
 #include <sys/queue.h>
+#include <sys/syscall.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* retrieve the calling thread's tid via gettid() */
+int rte_sys_gettid(void)
+{
+	return (int)syscall(SYS_gettid);
+}
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 07/15] eal: apply affinity of EAL thread by assigned cpuset
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (5 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 06/15] eal: add rte_gettid() to acquire unique system tid Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 08/15] enic: fix re-define freebsd compile complain Cunming Liang
                       ` (8 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

During startup, each EAL thread sets its core affinity from its assigned cpuset.
The original 1:1 lcore-to-CPU mapping is kept if the '--lcores' option is not used.
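
As a rough, Linux-only illustration of applying a whole cpuset to the
calling thread: apply_cpuset() and allowed_cpu_count() are hypothetical
helpers, and sched_setaffinity() with pid 0 is swapped in here for the
pthread_setaffinity_np() call that the patch itself uses.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Restrict the calling thread to the CPUs in 'set'.
 * With pid 0, sched_setaffinity() targets the calling thread on Linux.
 */
static int apply_cpuset(const cpu_set_t *set)
{
	assert(set != NULL);
	return sched_setaffinity(0, sizeof(cpu_set_t), set);
}

/* Number of CPUs the calling thread may currently run on, or -1 on error. */
static int allowed_cpu_count(void)
{
	cpu_set_t cur;

	CPU_ZERO(&cur);
	if (sched_getaffinity(0, sizeof(cur), &cur) != 0)
		return -1;
	return CPU_COUNT(&cur);
}
```

A set holding a single CPU reproduces the original 1:1 behaviour; a set
holding several CPUs lets the scheduler place the thread on any of them.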

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
 lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
 lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
 4 files changed, 54 insertions(+), 96 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..98c5a83 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
 	int i, fctret, ret;
 	pthread_t thread_id;
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
-		rte_config.master_lcore, thread_id);
-
 	eal_check_mem_on_local_socket();
 
 	rte_eal_mcfg_complete();
 
+	eal_thread_init_master(rte_config.master_lcore);
+
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		rte_config.master_lcore, thread_id, cpuset);
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
@@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
 			rte_panic("Cannot create thread\n");
 	}
 
-	eal_thread_init_master(rte_config.master_lcore);
-
 	/*
 	 * Launch a dummy function on all slave lcores, so that master lcore
 	 * knows they are all ready when this function returns.
diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index d0c077b..5b16302 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpuset_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +146,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +158,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%p)\n",
-		lcore_id, thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +168,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		lcore_id, thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f99e158..c95adec 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -702,6 +702,7 @@ rte_eal_init(int argc, char **argv)
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
 	struct shared_driver *solib = NULL;
 	const char *logid;
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -802,8 +803,10 @@ rte_eal_init(int argc, char **argv)
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%x)\n",
-		rte_config.master_lcore, (int)thread_id);
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		rte_config.master_lcore, (int)thread_id, cpuset);
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index ed20c93..6eb1525 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -52,6 +52,7 @@
 #include <rte_eal.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_memcpy.h>
 
 #include "eal_private.h"
 #include "eal_thread.h"
@@ -97,61 +98,34 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
 	return 0;
 }
 
-/* set affinity for current thread */
+/* set affinity for current EAL thread */
 static int
 eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpu_set_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpu_set_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	rte_memcpy(&RTE_PER_LCORE(_cpuset),
+		   &lcore_config[lcore_id].cpuset, sizeof(rte_cpuset_t));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +148,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +160,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%x)\n",
-		lcore_id, (int)thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +170,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		lcore_id, (int)thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 08/15] enic: fix re-define freebsd compile complain
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (6 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 07/15] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 09/15] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
                       ` (7 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

Some macros are already defined by FreeBSD's 'sys/param.h'; undefine them
before redefining them.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_pmd_enic/enic.h        | 1 +
 lib/librte_pmd_enic/enic_compat.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
index c43417c..189c3b9 100644
--- a/lib/librte_pmd_enic/enic.h
+++ b/lib/librte_pmd_enic/enic.h
@@ -66,6 +66,7 @@
 #define ENIC_CALC_IP_CKSUM      1
 #define ENIC_CALC_TCP_UDP_CKSUM 2
 #define ENIC_MAX_MTU            9000
+#undef PAGE_SIZE
 #define PAGE_SIZE               4096
 #define PAGE_ROUND_UP(x) \
 	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
index b1af838..b84c766 100644
--- a/lib/librte_pmd_enic/enic_compat.h
+++ b/lib/librte_pmd_enic/enic_compat.h
@@ -67,6 +67,7 @@
 #define pr_warn(y, args...) dev_warning(0, y, ##args)
 #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
 
+#undef ALIGN
 #define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
 #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
 #define udelay usleep
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 09/15] malloc: fix the issue of SOCKET_ID_ANY
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (7 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 08/15] enic: fix re-define freebsd compile complain Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 10/15] log: fix the gap to support non-EAL thread Cunming Liang
                       ` (6 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

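Add a check on the value returned by rte_socket_id() so that an unexpected
value such as SOCKET_ID_ANY (-1) is never used as the NUMA socket;
fall back to socket 0 in that case.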

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_malloc/malloc_heap.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-	return rte_socket_id();
+	unsigned socket_id = rte_socket_id();
+
+	if (socket_id == (unsigned)SOCKET_ID_ANY)
+		return 0;
+
+	return socket_id;
 }
 
 void *
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 10/15] log: fix the gap to support non-EAL thread
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (8 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 09/15] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 11/15] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
                       ` (5 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

For non-EAL threads, *_lcore_id* is invalid, i.e. not less than RTE_MAX_LCORE.
The patch adds that check so that only EAL threads use the per-thread
log level and log type.
All other threads share the global log level and log type.
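
The fallback can be sketched in isolation as below. All names are
hypothetical stand-ins (MAX_LCORE for RTE_MAX_LCORE, LCORE_ANY for
LCORE_ID_ANY), and the level range 1..8 is an assumption about the
RTE_LOG_* values rather than something taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LCORE 128u            /* hypothetical stand-in for RTE_MAX_LCORE */
#define LCORE_ANY ((unsigned)-1)  /* hypothetical stand-in for LCORE_ID_ANY */

struct cur_msg { uint32_t loglevel; };

static struct cur_msg per_lcore_msg[MAX_LCORE]; /* state for EAL threads */
static uint32_t global_level = 8;               /* shared fallback value */

/* Record the level of the message being processed; EAL threads only. */
static void save_cur_level(unsigned lcore_id, uint32_t level)
{
	assert(level >= 1 && level <= 8);
	if (lcore_id < MAX_LCORE)
		per_lcore_msg[lcore_id].loglevel = level;
}

/* Return the per-thread level, or the global one for non-EAL threads. */
static uint32_t cur_level(unsigned lcore_id)
{
	if (lcore_id >= MAX_LCORE)
		return global_level;
	return per_lcore_msg[lcore_id].loglevel;
}
```

Without the bounds checks, a non-EAL caller would index the per-lcore
array with an out-of-range value.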

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
 lib/librte_eal/common/include/rte_log.h |  5 +++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
 		rte_logs.type &= (~type);
 }
 
+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+	return rte_logs.type;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_level();
 	return log_cur_msg[lcore_id].loglevel;
 }
 
@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_type();
 	return log_cur_msg[lcore_id].logtype;
 }
 
@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
 
 	/* save loglevel and logtype in a global per-lcore variable */
 	lcore_id = rte_lcore_id();
-	log_cur_msg[lcore_id].loglevel = level;
-	log_cur_msg[lcore_id].logtype = logtype;
+	if (lcore_id < RTE_MAX_LCORE) {
+		log_cur_msg[lcore_id].loglevel = level;
+		log_cur_msg[lcore_id].logtype = logtype;
+	}
 
 	ret = vfprintf(f, format, ap);
 	fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);
 
 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 11/15] eal: set _lcore_id and _socket_id to (-1) by default
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (9 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 10/15] log: fix the gap to support non-EAL thread Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 12/15] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
                       ` (4 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

For non-EAL threads, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an index need to handle this case.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its affinity
by calling rte_thread_set_affinity().
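
A minimal sketch of the effect of the new defaults, using plain __thread
storage instead of RTE_DEFINE_PER_LCORE(). The names tls_lcore_id,
adopt_lcore() and has_valid_lcore() are hypothetical and exist only for
this illustration.

```c
#include <assert.h>

#define MAX_LCORE 128u            /* hypothetical stand-in for RTE_MAX_LCORE */
#define LCORE_ANY ((unsigned)-1)  /* hypothetical stand-in for LCORE_ID_ANY */

/* Every thread starts with the "any" sentinel instead of 0. */
static __thread unsigned tls_lcore_id = LCORE_ANY;

/* Called by a (hypothetical) launcher when it adopts the thread as lcore 'id'. */
static void adopt_lcore(unsigned id)
{
	assert(id < MAX_LCORE);
	tls_lcore_id = id;
}

/* Code that indexes per-lcore arrays should check this first. */
static int has_valid_lcore(void)
{
	return tls_lcore_id < MAX_LCORE;
}
```

With a zero default, a user-created pthread would be indistinguishable
from lcore 0; the sentinel makes the difference detectable.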

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 5b16302..2b3c9a8 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 6eb1525..ab94e20 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -57,8 +57,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 12/15] eal: fix recursive spinlock in non-EAL thraed
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (10 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 11/15] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 13/15] mempool: add support to non-EAL thread Cunming Liang
                       ` (3 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

In a non-EAL thread, lcore_id is always LCORE_ID_ANY,
so it cannot be used as the unique owner ID of a recursive spinlock.
Use rte_gettid() instead.
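
To show why the owner ID must be unique per thread, here is a standalone
sketch of a recursive spinlock keyed by the kernel thread ID. It is not
the DPDK implementation: struct rec_lock, rec_init(), rec_acquire() and
rec_release() are hypothetical, and C11 atomics plus a direct gettid
syscall (Linux only) replace the rte_spinlock and rte_gettid() calls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical recursive lock: a flag, the owner's tid, and a depth count. */
struct rec_lock {
	atomic_flag locked;
	atomic_int user;  /* tid of the owner, -1 when free */
	int count;        /* recursion depth, touched only by the owner */
};

static void rec_init(struct rec_lock *l)
{
	atomic_flag_clear(&l->locked);
	atomic_store(&l->user, -1);
	l->count = 0;
}

static void rec_acquire(struct rec_lock *l)
{
	int id = (int)syscall(SYS_gettid); /* unique per thread, EAL or not */

	/* Only a thread that already owns the lock may skip the spin. */
	if (atomic_load_explicit(&l->user, memory_order_relaxed) != id) {
		while (atomic_flag_test_and_set_explicit(&l->locked,
							 memory_order_acquire))
			; /* spin until the flag is released */
		atomic_store_explicit(&l->user, id, memory_order_relaxed);
	}
	l->count++;
}

static void rec_release(struct rec_lock *l)
{
	assert(l->count > 0);
	if (--l->count == 0) {
		atomic_store_explicit(&l->user, -1, memory_order_relaxed);
		atomic_flag_clear_explicit(&l->locked, memory_order_release);
	}
}
```

If the ID were the lcore ID, every non-EAL thread would present the same
value and could pass the ownership test without holding the lock.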

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 13/15] mempool: add support to non-EAL thread
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (11 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 12/15] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 14/15] ring: " Cunming Liang
                       ` (2 subsequent siblings)
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

For non-EAL threads, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used from either an EAL thread or any user pthread.
In a non-EAL thread the pool relies directly on rte_ring, which is non-preemptive.
Running multiple pthreads per CPU that compete for the same rte_mempool is
therefore not recommended: performance will be poor, and there is a critical
risk if the scheduling policy is real-time (RT).
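
The bypass can be pictured with a toy pool. This is a single-threaded
sketch under simplifying assumptions: struct toy_pool, toy_put() and
toy_get() are hypothetical, and a plain array stands in for the shared
rte_ring.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LCORE 4u              /* hypothetical stand-in for RTE_MAX_LCORE */
#define LCORE_ANY ((unsigned)-1)  /* hypothetical stand-in for LCORE_ID_ANY */
#define CACHE_SZ  8u
#define POOL_SZ   32u

/* Toy pool: one shared stack plus a small cache per lcore. Not thread-safe. */
struct toy_pool {
	void *shared[POOL_SZ];
	size_t shared_len;
	void *cache[MAX_LCORE][CACHE_SZ];
	size_t cache_len[MAX_LCORE];
};

/* Return an object; callers without a valid lcore id skip the cache. */
static int toy_put(struct toy_pool *p, unsigned lcore_id, void *obj)
{
	assert(p != NULL);
	if (lcore_id < MAX_LCORE && p->cache_len[lcore_id] < CACHE_SZ) {
		p->cache[lcore_id][p->cache_len[lcore_id]++] = obj;
		return 0;
	}
	if (p->shared_len == POOL_SZ)
		return -1; /* shared storage is full */
	p->shared[p->shared_len++] = obj;
	return 0;
}

/* Take an object; callers without a valid lcore id use shared storage only. */
static void *toy_get(struct toy_pool *p, unsigned lcore_id)
{
	assert(p != NULL);
	if (lcore_id < MAX_LCORE && p->cache_len[lcore_id] > 0)
		return p->cache[lcore_id][--p->cache_len[lcore_id]];
	if (p->shared_len == 0)
		return NULL;
	return p->shared[--p->shared_len];
}
```

Objects parked in an lcore's cache are invisible to non-EAL callers, so
those callers always pay the cost of the shared structure.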

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..4845f27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
-		unsigned __lcore_id = rte_lcore_id();		\
-		mp->stats[__lcore_id].name##_objs += n;		\
-		mp->stats[__lcore_id].name##_bulk += 1;		\
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			mp->stats[__lcore_id].name##_objs += n;	\
+			mp->stats[__lcore_id].name##_bulk += 1;	\
+		}                                               \
 	} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 	__MEMPOOL_STAT_ADD(mp, put, n);
 
 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-	/* cache is not enabled or single producer */
-	if (unlikely(cache_size == 0 || is_mp == 0))
+	/* cache is not enabled or single producer or none EAL thread */
+	if (unlikely(cache_size == 0 || is_mp == 0 ||
+		     lcore_id >= RTE_MAX_LCORE))
 		goto ring_enqueue;
 
 	/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
 	uint32_t cache_size = mp->cache_size;
 
 	/* cache is not enabled or single consumer */
-	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+	if (unlikely(cache_size == 0 || is_mc == 0 ||
+		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
 		goto ring_dequeue;
 
 	cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 14/15] ring: add support to non-EAL thread
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (12 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 13/15] mempool: add support to non-EAL thread Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 15/15] timer: " Cunming Liang
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

The ring debug statistics are not updated for non-EAL threads.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_ring/rte_ring.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..39bacdd 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {		\
-		unsigned __lcore_id = rte_lcore_id();	\
-		r->stats[__lcore_id].name##_objs += n;	\
-		r->stats[__lcore_id].name##_bulk += 1;	\
+#define __RING_STAT_ADD(r, name, n) do {                        \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			r->stats[__lcore_id].name##_objs += n;  \
+			r->stats[__lcore_id].name##_bulk += 1;  \
+		}                                               \
 	} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v2 15/15] timer: add support to non-EAL thread
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (13 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 14/15] ring: " Cunming Liang
@ 2015-01-28  6:59     ` Cunming Liang
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
  15 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-28  6:59 UTC (permalink / raw)
  To: dev

Allow timers to be set up only for EAL (lcore) threads (lcore_id < RTE_MAX_LCORE).
For example, a dynamically created thread can reset or stop a timer that belongs
to an lcore thread, but it is not allowed to set up a timer for itself or for
another non-lcore thread.
rte_timer_manage() called from a non-lcore thread simply does nothing and
returns straight away.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++---------
 lib/librte_timer/rte_timer.h |  2 +-
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..601c159 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
 
 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do {				\
-		unsigned __lcore_id = rte_lcore_id();		\
-		priv_timer[__lcore_id].stats.name += (n);	\
+#define __TIMER_STAT_ADD(name, n) do {					\
+		unsigned __lcore_id = rte_lcore_id();			\
+		if (__lcore_id < RTE_MAX_LCORE)				\
+			priv_timer[__lcore_id].stats.name += (n);	\
 	} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
 	unsigned lcore_id;
 
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		lcore_id = LCORE_ID_ANY;
 
 	/* wait that the timer is in correct status before update,
 	 * and mark it as being configured */
 	while (success == 0) {
 		prev_status.u32 = tim->status.u32;
 
+		/*
+		 * prevent race condition of non-EAL threads
+		 * to update the timer. When 'owner == LCORE_ID_ANY',
+		 * it means updated by a non-EAL thread.
+		 */
+		if (lcore_id == (unsigned)LCORE_ID_ANY &&
+		    (uint16_t)lcore_id == prev_status.owner)
+			return -1;
+
 		/* timer is running on another core, exit */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    (unsigned)prev_status.owner != lcore_id)
+		    prev_status.owner != (uint16_t)lcore_id)
 			return -1;
 
 		/* timer is being configured on another core */
@@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 
 	/* round robin for tim_lcore */
 	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-					       0, 1);
-		priv_timer[lcore_id].prev_lcore = tim_lcore;
+		if (lcore_id < RTE_MAX_LCORE) {
+			tim_lcore = rte_get_next_lcore(
+				priv_timer[lcore_id].prev_lcore,
+				0, 1);
+			priv_timer[lcore_id].prev_lcore = tim_lcore;
+		} else
+			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
 	}
 
 	/* wait that the timer is in correct status before update,
@@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 		return -1;
 
 	__TIMER_STAT_ADD(reset, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
 		return -1;
 
 	__TIMER_STAT_ADD(stop, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -499,6 +517,10 @@ void rte_timer_manage(void)
 	uint64_t cur_time;
 	int i, ret;
 
+	/* timer manager only runs on EAL thread */
+	if (lcore_id >= RTE_MAX_LCORE)
+		return;
+
 	__TIMER_STAT_ADD(manage, 1);
 	/* optimize for the case where per-cpu list is empty */
 	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..5c5df91 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
 
-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
 
 /**
  * Timer type: Periodic or single (one-shot).
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core
  2015-01-28  6:59   ` [dpdk-dev] [PATCH v2 " Cunming Liang
                       ` (14 preceding siblings ...)
  2015-01-28  6:59     ` [dpdk-dev] [PATCH v2 15/15] timer: " Cunming Liang
@ 2015-01-29  0:24     ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 01/16] eal: add cpuset into per EAL thread lcore_config Cunming Liang
                         ` (16 more replies)
  15 siblings, 17 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

v3 changes:
  add sched_yield() in rte_ring to avoid long spin [15/16]

v2 changes:
  add '<number>-<number>' support for EAL option '--lcores'

This patch series contains EAL enhancements and library fixes
to run multiple pthreads (either EAL or non-EAL threads) per physical core.
The two major changes are listed below:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore stands for an EAL thread rather than a logical core.
  The change adds a new EAL option to allow static lcore-to-cpuset assignment.
  An lcore (EAL thread) is then affinitized to a cpuset; the original 1:1 mapping is the special case.
- Fix the libraries to allow running on any non-EAL thread.
  This closes the gaps in running the libraries in non-EAL threads (dynamically created by the user).
  Each fixed library handles the case of rte_lcore_id() >= RTE_MAX_LCORE.

Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in RFC review.




Cunming Liang (16):
  eal: add cpuset into per EAL thread lcore_config
  eal: new eal option '--lcores' for cpu assignment
  eal: add support parsing socket_id from cpuset
  eal: new TLS definition and API declaration
  eal: add eal_common_thread.c for common thread API
  eal: add rte_gettid() to acquire unique system tid
  eal: apply affinity of EAL thread by assigned cpuset
  enic: fix re-define freebsd compile complain
  malloc: fix the issue of SOCKET_ID_ANY
  log: fix the gap to support non-EAL thread
  eal: set _lcore_id and _socket_id to (-1) by default
  eal: fix recursive spinlock in non-EAL thread
  mempool: add support to non-EAL thread
  ring: add support to non-EAL thread
  ring: add sched_yield to avoid spin forever
  timer: add support to non-EAL thread

 lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
 lib/librte_eal/bsdapp/eal/eal.c                    |  13 +-
 lib/librte_eal/bsdapp/eal/eal_lcore.c              |  14 +
 lib/librte_eal/bsdapp/eal/eal_memory.c             |   2 +
 lib/librte_eal/bsdapp/eal/eal_thread.c             |  76 +++---
 lib/librte_eal/common/eal_common_launch.c          |   1 -
 lib/librte_eal/common/eal_common_log.c             |  17 +-
 lib/librte_eal/common/eal_common_options.c         | 300 ++++++++++++++++++++-
 lib/librte_eal/common/eal_common_thread.c          | 142 ++++++++++
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/eal_thread.h                 |  66 +++++
 .../common/include/generic/rte_spinlock.h          |   4 +-
 lib/librte_eal/common/include/rte_eal.h            |  27 ++
 lib/librte_eal/common/include/rte_lcore.h          |  37 ++-
 lib/librte_eal/common/include/rte_log.h            |   5 +
 lib/librte_eal/linuxapp/eal/Makefile               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   7 +-
 lib/librte_eal/linuxapp/eal/eal_lcore.c            |  15 ++
 lib/librte_eal/linuxapp/eal/eal_thread.c           |  78 +++---
 lib/librte_malloc/malloc_heap.h                    |   7 +-
 lib/librte_mempool/rte_mempool.h                   |  18 +-
 lib/librte_pmd_enic/enic.h                         |   1 +
 lib/librte_pmd_enic/enic_compat.h                  |   1 +
 lib/librte_ring/rte_ring.h                         |  35 ++-
 lib/librte_timer/rte_timer.c                       |  40 ++-
 lib/librte_timer/rte_timer.h                       |   2 +-
 26 files changed, 778 insertions(+), 137 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

-- 
1.8.1.4


* [dpdk-dev] [PATCH v3 01/16] eal: add cpuset into per EAL thread lcore_config
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 02/16] eal: new eal option '--lcores' for cpu assignment Cunming Liang
                         ` (15 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

The patch adds 'cpuset' to the per-lcore configuration 'lcore_config[]',
as an lcore is no longer always pinned 1:1 to a physical cpu.
An lcore now stands for an EAL thread rather than a logical cpu.

It doesn't change the default behavior of 1:1 mapping, but allows
an EAL thread to have affinity to multiple cpus.
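
As a minimal sketch of the default described above, assuming Linux and the
glibc CPU_*() macros: the helper names default_cpuset() and widen_cpuset()
are hypothetical and only mirror the CPU_ZERO()/CPU_SET() pair that the
diff below adds to rte_eal_cpu_init().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>

/*
 * Fill 'set' with the default 1:1 mapping: only the cpu whose id equals
 * lcore_id is a member of the set.
 */
static void
default_cpuset(unsigned lcore_id, cpu_set_t *set)
{
	assert(lcore_id < (unsigned)CPU_SETSIZE);
	CPU_ZERO(set);
	CPU_SET(lcore_id, set);
}

/* Widen an existing set so the same thread may also run on 'cpu'. */
static void
widen_cpuset(unsigned cpu, cpu_set_t *set)
{
	assert(cpu < (unsigned)CPU_SETSIZE);
	CPU_SET(cpu, set);
}
```

Starting from the 1:1 set and adding further cpus gives the 1:n case; the
1:1 mapping stays the special case with exactly one member.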

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c     | 7 +++++++
 lib/librte_eal/bsdapp/eal/eal_memory.c    | 2 ++
 lib/librte_eal/common/include/rte_lcore.h | 8 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile      | 1 +
 lib/librte_eal/linuxapp/eal/eal_lcore.c   | 8 ++++++++
 5 files changed, 26 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 662f024..72f8ac2 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -76,11 +76,18 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
 		lcore_config[lcore_id].detected = (lcore_id < ncpus);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 65ee87d..a34d500 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -45,6 +45,8 @@
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 
+/* avoid a redefinition conflict with the freebsd header */
+#undef PAGE_SIZE
 #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
 
 /*
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 49b2c03..4c7d6bb 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -50,6 +50,13 @@ extern "C" {
 
 #define LCORE_ID_ANY -1    /**< Any lcore. */
 
+#if defined(__linux__)
+	typedef	cpu_set_t rte_cpuset_t;
+#elif defined(__FreeBSD__)
+#include <pthread_np.h>
+	typedef cpuset_t rte_cpuset_t;
+#endif
+
 /**
  * Structure storing internal configuration (per-lcore)
  */
@@ -65,6 +72,7 @@ struct lcore_config {
 	unsigned socket_id;        /**< physical socket id for this lcore */
 	unsigned core_id;          /**< core number on socket for this lcore */
 	int core_index;            /**< relative index, starting from 0 */
+	rte_cpuset_t cpuset;       /**< cpu set which the lcore affinity to */
 };
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 72ecf3a..0e9c447 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -87,6 +87,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
+CFLAGS_eal_lcore.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index c67e0e6..29615f8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -158,11 +158,19 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
+		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
-- 
1.8.1.4


* [dpdk-dev] [PATCH v3 02/16] eal: new eal option '--lcores' for cpu assignment
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 01/16] eal: add cpuset into per EAL thread lcore_config Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 03/16] eal: add support parsing socket_id from cpuset Cunming Liang
                         ` (14 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

It adds a new EAL long option '--lcores' for EAL thread cpuset assignment.

The format pattern:
	--lcores='lcores[@cpus]<,lcores[@cpus]>'
lcores and cpus can each be a single number, a range or a group.
'(' and ')' are required if it's a group.
If '@cpus' is not supplied, cpus takes the same value as lcores.
For a plain range such as '7-8', each lcore maps to the cpu with the same number.

e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL threads as below
  lcore 0 runs on cpuset 0x41 (cpu 0,6)
  lcore 1 runs on cpuset 0x2 (cpu 1)
  lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
  lcores 3,4,5 run on cpuset 0x5 (cpu 0,2)
  lcore 6 runs on cpuset 0x41 (cpu 0,6)
  lcore 7 runs on cpuset 0x80 (cpu 7)
  lcore 8 runs on cpuset 0x100 (cpu 8)
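
The cpuset values in the listing above are plain bitmasks with bit n set for
cpu n. As a quick cross-check, here is a minimal sketch that rebuilds such a
mask from a cpu list. The helper name cpus_to_mask() is hypothetical and not
part of this patch, and it assumes cpu ids below 64.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Build a bitmask with bit 'n' set for every cpu id 'n' in cpus[].
 * Illustration only: ids of 64 and above are ignored.
 */
static uint64_t
cpus_to_mask(const unsigned *cpus, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cpus[i] < 64)
			mask |= UINT64_C(1) << cpus[i];
	}
	return mask;
}
```

With this helper, the list {0, 6} gives 0x41, {5, 6, 7} gives 0xe0 and
{0, 2} gives 0x5, matching the values in the example.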

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_launch.c  |   1 -
 lib/librte_eal/common/eal_common_options.c | 300 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/Makefile       |   1 +
 4 files changed, 299 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
index 599f83b..2d732b1 100644
--- a/lib/librte_eal/common/eal_common_launch.c
+++ b/lib/librte_eal/common/eal_common_launch.c
@@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
 		rte_eal_wait_lcore(lcore_id);
 	}
 }
-
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 67e02dc..29ebb6f 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -45,6 +45,7 @@
 #include <rte_lcore.h>
 #include <rte_version.h>
 #include <rte_devargs.h>
+#include <rte_memcpy.h>
 
 #include "eal_internal_cfg.h"
 #include "eal_options.h"
@@ -85,6 +86,7 @@ eal_long_options[] = {
 	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
 	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
 	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
+	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
 	{0, 0, 0, 0}
 };
 
@@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
 			if (min == RTE_MAX_LCORE)
 				min = idx;
 			for (idx = min; idx <= max; idx++) {
-				cfg->lcore_role[idx] = ROLE_RTE;
-				lcore_config[idx].core_index = count;
-				count++;
+				if (cfg->lcore_role[idx] != ROLE_RTE) {
+					cfg->lcore_role[idx] = ROLE_RTE;
+					lcore_config[idx].core_index = count;
+					count++;
+				}
 			}
 			min = RTE_MAX_LCORE;
 		} else
@@ -292,6 +296,279 @@ eal_parse_master_lcore(const char *arg)
 	return 0;
 }
 
+/*
+ * Parse elem, the elem could be single number/range or '(' ')' group
+ * Within group elem, '-' used for a range separator;
+ *                    ',' used for a single number.
+ */
+static int
+eal_parse_set(const char *input, uint16_t set[], unsigned num)
+{
+	unsigned idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit or left bracket qualifies as the start point */
+	if ((!isdigit(*str) && *str != '(') || *str == '\0')
+		return -1;
+
+	/* process single number or single range of number */
+	if (*str != '(') {
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+		else {
+			while (isblank(*end))
+				end++;
+
+			min = idx;
+			max = idx;
+			if (*end == '-') {
+				/* process single <number>-<number> */
+				end++;
+				while (isblank(*end))
+					end++;
+				if (!isdigit(*end))
+					return -1;
+
+				errno = 0;
+				idx = strtoul(end, &end, 10);
+				if (errno || end == NULL || idx >= num)
+					return -1;
+				max = idx;
+				while (isblank(*end))
+					end++;
+				if (*end != ',' && *end != '\0')
+					return -1;
+			}
+
+			if (*end != ',' && *end != '\0' &&
+			    *end != '@')
+				return -1;
+
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			return end - input;
+		}
+	}
+
+	/* process set within bracket */
+	str++;
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = RTE_MAX_LCORE;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-',',' and ')' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == ')')) {
+			max = idx;
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			min = RTE_MAX_LCORE;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0' && *end != ')');
+
+	return str - input;
+}
+
+/* convert from set array to cpuset bitmap */
+static inline int
+convert_to_cpuset(rte_cpuset_t *cpusetp,
+	      uint16_t *set, unsigned num)
+{
+	unsigned idx;
+
+	CPU_ZERO(cpusetp);
+
+	for (idx = 0; idx < num; idx++) {
+		if (!set[idx])
+			continue;
+
+		if (!lcore_config[idx].detected) {
+			RTE_LOG(ERR, EAL, "core %u "
+				"unavailable\n", idx);
+			return -1;
+		}
+
+		CPU_SET(idx, cpusetp);
+	}
+
+	return 0;
+}
+
+/*
+ * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
+ * lcores, cpus could be a single digit/range or a group.
+ * '(' and ')' are necessary if it's a group.
+ * If not supply '@cpus', the value of cpus uses the same as lcores.
+ * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means start 9 EAL thread as below
+ *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 1 runs on cpuset 0x2 (cpu 1)
+ *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
+ *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
+ *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 7 runs on cpuset 0x80 (cpu 7)
+ *   lcore 8 runs on cpuset 0x100 (cpu 8)
+ */
+static int
+eal_parse_lcores(const char *lcores)
+{
+	struct rte_config *cfg = rte_eal_get_configuration();
+	static uint16_t set[RTE_MAX_LCORE];
+	unsigned idx = 0;
+	int i;
+	unsigned count = 0;
+	const char *lcore_start = NULL;
+	const char *end = NULL;
+	int offset;
+	rte_cpuset_t cpuset;
+	int lflags = 0;
+	int ret = -1;
+
+	if (lcores == NULL)
+		return -1;
+
+	/* Remove all blank characters ahead and after */
+	while (isblank(*lcores))
+		lcores++;
+	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	while ((i > 0) && isblank(lcores[i - 1]))
+		i--;
+
+	CPU_ZERO(&cpuset);
+
+	/* Reset lcore config */
+	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+		cfg->lcore_role[idx] = ROLE_OFF;
+		lcore_config[idx].core_index = -1;
+		CPU_ZERO(&lcore_config[idx].cpuset);
+	}
+
+	/* Get list of cores */
+	do {
+		while (isblank(*lcores))
+			lcores++;
+		if (*lcores == '\0')
+			goto err;
+
+		/* record lcore_set start point */
+		lcore_start = lcores;
+
+		/* go across a complete bracket */
+		if (*lcore_start == '(') {
+			lcores += strcspn(lcores, ")");
+			if (*lcores++ == '\0')
+				goto err;
+		}
+
+		/* scan the separator '@', ','(next) or '\0'(finish) */
+		lcores += strcspn(lcores, "@,");
+
+		if (*lcores == '@') {
+			/* explicitly assigned cpu_set */
+			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
+			if (offset < 0)
+				goto err;
+
+			/* prepare cpu_set and update the end cursor */
+			if (0 > convert_to_cpuset(&cpuset,
+						  set, RTE_DIM(set)))
+				goto err;
+			end = lcores + 1 + offset;
+		} else { /* ',' or '\0' */
+			/* haven't given cpu_set, current loop done */
+			end = lcores;
+
+			/* go back to check <number>-<number> */
+			offset = strcspn(lcore_start, "-");
+			if (offset < (end - lcore_start))
+				lflags = 1;
+		}
+
+		if (*end != ',' && *end != '\0')
+			goto err;
+
+		/* parse lcore_set from start point */
+		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
+			goto err;
+
+		/* without '@', by default using lcore_set as cpu_set */
+		if (*lcores != '@' &&
+		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
+			goto err;
+
+		/* start to update lcore_set */
+		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+			if (!set[idx])
+				continue;
+
+			if (cfg->lcore_role[idx] != ROLE_RTE) {
+				lcore_config[idx].core_index = count;
+				cfg->lcore_role[idx] = ROLE_RTE;
+				count++;
+			}
+
+			if (lflags) {
+				CPU_ZERO(&cpuset);
+				CPU_SET(idx, &cpuset);
+			}
+			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
+				   sizeof(rte_cpuset_t));
+		}
+
+		lcores = end + 1;
+	} while (*end != '\0');
+
+	if (count == 0)
+		goto err;
+
+	cfg->lcore_count = count;
+	lcores_parsed = 1;
+	ret = 0;
+
+err:
+
+	return ret;
+}
+
 static int
 eal_parse_syslog(const char *facility, struct internal_config *conf)
 {
@@ -492,6 +769,13 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->log_level = log;
 		break;
 	}
+	case OPT_LCORES_NUM:
+		if (eal_parse_lcores(optarg) < 0) {
+			RTE_LOG(ERR, EAL, "invalid parameter for --"
+				OPT_LCORES "\n");
+			return -1;
+		}
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -530,7 +814,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
 
 	if (!lcores_parsed) {
 		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
-			"-c or -l\n");
+			"-c, -l or --lcores\n");
 		return -1;
 	}
 	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
@@ -586,6 +870,14 @@ eal_common_usage(void)
 	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
 	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
 	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
+	       "  --"OPT_LCORES" MAP: maps between lcore_set to phys_cpu_set\n"
+	       "                 The argument format is\n"
+	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
+	       "                 lcores and cpus list are grouped by '(' and ')'\n"
+	       "                 Within the group, '-' is used for range separator,\n"
+	       "                 ',' is used for single number separator.\n"
+	       "                 '( )' can be omitted for single element group, '@' \n"
+	       "                 can be omitted if cpus and lcores has the same value\n"
 	       "  -n NUM       : Number of memory channels\n"
 	       "  -v           : Display version information on startup\n"
 	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e476f8d..a1cc59f 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -77,6 +77,8 @@ enum {
 	OPT_CREATE_UIO_DEV_NUM,
 #define OPT_VFIO_INTR    "vfio-intr"
 	OPT_VFIO_INTR_NUM,
+#define OPT_LCORES "lcores"
+	OPT_LCORES_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 0e9c447..025d836 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
+CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4


* [dpdk-dev] [PATCH v3 03/16] eal: add support parsing socket_id from cpuset
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 01/16] eal: add cpuset into per EAL thread lcore_config Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 02/16] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 04/16] eal: new TLS definition and API declaration Cunming Liang
                         ` (13 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

eal_cpuset_socket_id() returns the socket_id if all cpus in the cpuset belong
to the same NUMA node; otherwise it returns SOCKET_ID_ANY.
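
The rule can be sketched as below, assuming Linux and the glibc CPU_*()
macros. SOCKET_ANY, cpuset_socket() and the caller-supplied cpu_to_socket[]
table are hypothetical stand-ins for illustration; the patch itself looks up
each cpu through eal_cpu_socket_id().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>

#define SOCKET_ANY (-1)   /* stand-in for SOCKET_ID_ANY */

/*
 * Return the NUMA node shared by all cpus in 'set', or SOCKET_ANY when
 * the set is empty or spans more than one node. cpu_to_socket[] is a
 * topology table with 'ncpus' entries, indexed by cpu id.
 */
static int
cpuset_socket(const cpu_set_t *set, const int *cpu_to_socket, unsigned ncpus)
{
	int socket = SOCKET_ANY;
	unsigned cpu;

	assert(set != NULL && cpu_to_socket != NULL);

	for (cpu = 0; cpu < ncpus && cpu < (unsigned)CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, set))
			continue;
		if (socket == SOCKET_ANY)
			socket = cpu_to_socket[cpu];   /* first member decides */
		else if (socket != cpu_to_socket[cpu])
			return SOCKET_ANY;             /* members disagree */
	}
	return socket;
}
```

On a two-node table {0, 0, 1, 1}, the set {0, 1} resolves to node 0, while
{0, 1, 2} crosses nodes and resolves to SOCKET_ANY.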

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++
 lib/librte_eal/common/eal_thread.h      | 52 +++++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +++++
 3 files changed, 66 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 72f8ac2..162fb4f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -41,6 +41,7 @@
 #include <rte_debug.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 
 /* No topology information available on FreeBSD including NUMA info */
 #define cpu_core_id(X) 0
@@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(__rte_unused unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index b53b84d..a25ee86 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -34,6 +34,10 @@
 #ifndef EAL_THREAD_H
 #define EAL_THREAD_H
 
+#include <sched.h>
+
+#include <rte_debug.h>
+
 /**
  * basic loop of thread, called for each thread by eal_init().
  *
@@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
  */
 void eal_thread_init_master(unsigned lcore_id);
 
+/**
+ * Get the NUMA socket id from cpu id.
+ * This function is private to EAL.
+ *
+ * @param cpu_id
+ *   The logical processor id.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+unsigned eal_cpu_socket_id(unsigned cpu_id);
+
+/**
+ * Get the NUMA socket id from cpuset.
+ * This function is private to EAL.
+ *
+ * @param cpusetp
+ *   Pointer to a valid cpu set.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+static inline int
+eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
+{
+	unsigned cpu = 0;
+	int socket_id = SOCKET_ID_ANY;
+	int sid;
+
+	if (cpusetp == NULL)
+		return SOCKET_ID_ANY;
+
+	do {
+		if (!CPU_ISSET(cpu, cpusetp))
+			continue;
+
+		if (socket_id == SOCKET_ID_ANY)
+			socket_id = eal_cpu_socket_id(cpu);
+
+		sid = eal_cpu_socket_id(cpu);
+		if (socket_id != sid) {
+			socket_id = SOCKET_ID_ANY;
+			break;
+		}
+
+	} while (++cpu < RTE_MAX_LCORE);
+
+	return socket_id;
+}
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index 29615f8..922af6d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -45,6 +45,7 @@
 
 #include "eal_private.h"
 #include "eal_filesystem.h"
+#include "eal_thread.h"
 
 #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
 #define CORE_ID_FILE "topology/core_id"
@@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
-- 
1.8.1.4


* [dpdk-dev] [PATCH v3 04/16] eal: new TLS definition and API declaration
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (2 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 03/16] eal: add support parsing socket_id from cpuset Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 05/16] eal: add eal_common_thread.c for common thread API Cunming Liang
                         ` (12 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

1. add two TLS variables, *_socket_id* and *_cpuset*
2. add two external APIs, rte_thread_set_affinity() and rte_thread_get_affinity()
3. add one internal API, eal_thread_dump_affinity()

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
 lib/librte_eal/common/eal_thread.h        | 14 ++++++++++++++
 lib/librte_eal/common/include/rte_lcore.h | 29 +++++++++++++++++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
 4 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index ab05368..10220c7 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index a25ee86..28edf51 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
 	return socket_id;
 }
 
+/**
+ * Dump the current pthread cpuset.
+ * This function is private to EAL.
+ *
+ * @param str
+ *   The string buffer the cpuset will dump to.
+ * @param size
+ *   The string buffer size.
+ */
+#define CPU_STR_LEN            256
+void
+eal_thread_dump_affinity(char str[], unsigned size);
+
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 4c7d6bb..facdbdc 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -43,6 +43,7 @@
 #include <rte_per_lcore.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
+#include <rte_memory.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -80,7 +81,9 @@ struct lcore_config {
  */
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];
 
-RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
+RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
+RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
+RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
 
 /**
  * Return the ID of the execution unit we are running on.
@@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
 static inline unsigned
 rte_socket_id(void)
 {
-	return lcore_config[rte_lcore_id()].socket_id;
+	return RTE_PER_LCORE(_socket_id);
 }
 
 /**
@@ -229,6 +232,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap)
 	     i<RTE_MAX_LCORE;						\
 	     i = rte_get_next_lcore(i, 1, 0))
 
+/**
+ * Set core affinity of the current thread.
+ * Support both EAL and non-EAL thread and update TLS.
+ *
+ * @param cpusetp
+ *   Point to cpu_set_t for setting current thread affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_set_affinity(rte_cpuset_t *cpusetp);
+
+/**
+ * Get core affinity of the current thread.
+ *
+ * @param cpusetp
+ *   Point to cpu_set_t for getting current thread cpu affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_get_affinity(rte_cpuset_t *cpusetp);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 80a985f..748a83a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
-- 
1.8.1.4


* [dpdk-dev] [PATCH v3 05/16] eal: add eal_common_thread.c for common thread API
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (3 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 04/16] eal: new TLS definition and API declaration Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 06/16] eal: add rte_gettid() to acquire unique system tid Cunming Liang
                         ` (11 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

The API works for both EAL threads and non-EAL threads.
When rte_thread_set_affinity() is called, the *_socket_id* and
*_cpuset* of the calling thread are updated if the thread
successfully sets its cpu affinity.
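
The "update only on success" behavior can be sketched as below. This is an
illustration under stated assumptions, not the code of this patch: it is
Linux-only, it uses sched_setaffinity() with pid 0 (which applies to the
calling thread) in place of pthread_setaffinity_np(), and the names
cached_cpuset, set_self_affinity() and get_self_affinity() are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <string.h>

/* Per-thread copy of the last affinity applied (stand-in for the TLS cpuset). */
static __thread cpu_set_t cached_cpuset;

/*
 * Apply 'set' to the calling thread and refresh the per-thread cache only
 * if the kernel accepted it. Returns 0 on success, -1 on failure.
 */
static int
set_self_affinity(const cpu_set_t *set)
{
	if (set == NULL)
		return -1;
	/* pid 0 selects the calling thread */
	if (sched_setaffinity(0, sizeof(*set), set) != 0)
		return -1;
	memcpy(&cached_cpuset, set, sizeof(cached_cpuset));
	return 0;
}

/* Copy the cached affinity of the calling thread into 'out' (non-NULL). */
static void
get_self_affinity(cpu_set_t *out)
{
	assert(out != NULL);
	memcpy(out, &cached_cpuset, sizeof(*out));
}
```

Because the getter reads the cached copy rather than asking the kernel, it
only reflects affinity changes made through the matching setter.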

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile        |   1 +
 lib/librte_eal/common/eal_common_thread.c | 142 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile      |   2 +
 3 files changed, 145 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index d434882..78406be 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -73,6 +73,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
new file mode 100644
index 0000000..d996690
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -0,0 +1,142 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <sched.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memcpy.h>
+
+#include "eal_thread.h"
+
+int
+rte_thread_set_affinity(rte_cpuset_t *cpusetp)
+{
+	int s;
+	unsigned lcore_id;
+	pthread_t tid;
+
+	if (!cpusetp)
+		return -1;
+
+	lcore_id = rte_lcore_id();
+	if (lcore_id != (unsigned)LCORE_ID_ANY) {
+		/* EAL thread */
+		tid = lcore_config[lcore_id].thread_id;
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* update lcore_config */
+		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
+		rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
+			   sizeof(rte_cpuset_t));
+	} else {
+		/* non-EAL thread */
+		tid = pthread_self();
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+	}
+
+	return 0;
+}
+
+int
+rte_thread_get_affinity(rte_cpuset_t *cpusetp)
+{
+	if (!cpusetp)
+		return -1;
+
+	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
+		   sizeof(rte_cpuset_t));
+
+	return 0;
+}
+
+void
+eal_thread_dump_affinity(char str[], unsigned size)
+{
+	rte_cpuset_t cpuset;
+	unsigned cpu;
+	int ret;
+	unsigned int out = 0;
+
+	if (rte_thread_get_affinity(&cpuset) < 0) {
+		str[0] = '\0';
+		return;
+	}
+
+	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
+		if (!CPU_ISSET(cpu, &cpuset))
+			continue;
+
+		ret = snprintf(str + out,
+			       size - out, "%u,", cpu);
+		if (ret < 0 || (unsigned)ret >= size - out)
+			break;
+
+		out += ret;
+	}
+
+	/* remove the last separator */
+	if (out > 0)
+		str[out - 1] = '\0';
+}
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 025d836..07e21ca 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -85,6 +85,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
@@ -96,6 +97,7 @@ CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
+CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 06/16] eal: add rte_gettid() to acquire unique system tid
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (4 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 05/16] eal: add eal_common_thread.c for common thread API Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 07/16] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
                         ` (10 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

rte_gettid() wraps the Linux gettid() syscall and its FreeBSD equivalent, thr_self().
It provides a persistent, unique thread ID for the calling thread.
The ID is cached in TLS on the first call.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
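A minimal sketch of the caching described above, not the patch itself.
sketch_sys_gettid() and sketch_gettid() are hypothetical stand-ins for
rte_sys_gettid() and rte_gettid(), GCC's __thread is assumed in place of
RTE_DEFINE_PER_LCORE, and only the Linux path is shown.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical stand-in for rte_sys_gettid() on Linux. */
static int sketch_sys_gettid(void)
{
	return (int)syscall(SYS_gettid);
}

/* Counts how often the slow path ran in this thread (illustration only). */
static __thread int sketch_syscall_count;

/* Hypothetical stand-in for rte_gettid(): at most one syscall per thread. */
static int sketch_gettid(void)
{
	static __thread int cached_tid = -1;

	if (cached_tid == -1) {
		cached_tid = sketch_sys_gettid();
		assert(cached_tid != -1);
		sketch_syscall_count++;
	}
	return cached_tid;
}
```

Every call after the first one in a given thread returns the cached
value, so the ID is cheap enough to use on hot paths.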
 lib/librte_eal/bsdapp/eal/eal_thread.c   |  9 +++++++++
 lib/librte_eal/common/include/rte_eal.h  | 27 +++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_thread.c |  7 +++++++
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 10220c7..d0c077b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <sched.h>
 #include <pthread_np.h>
 #include <sys/queue.h>
+#include <sys/thr.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* acquire the calling thread's tid via thr_self() */
+int rte_sys_gettid(void)
+{
+	long lwpid;
+	thr_self(&lwpid);
+	return (int)lwpid;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..8ccdd65 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -41,6 +41,9 @@
  */
 
 #include <stdint.h>
+#include <sched.h>
+
+#include <rte_per_lcore.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t usage_func );
  */
 int rte_eal_has_hugepages(void);
 
+/**
+ * A wrapper API for the gettid syscall.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   This call always succeeds.
+ */
+int rte_sys_gettid(void);
+
+/**
+ * Get the system-wide unique thread ID.
+ *
+ * @return
+ *   The thread ID of the calling thread.
+ *   This call always succeeds.
+ */
+static inline int rte_gettid(void)
+{
+	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
+	if (RTE_PER_LCORE(_thread_id) == -1)
+		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
+	return RTE_PER_LCORE(_thread_id);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 748a83a..ed20c93 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <pthread.h>
 #include <sched.h>
 #include <sys/queue.h>
+#include <sys/syscall.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* acquire the calling thread's tid via gettid() */
+int rte_sys_gettid(void)
+{
+	return (int)syscall(SYS_gettid);
+}
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 07/16] eal: apply affinity of EAL thread by assigned cpuset
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (5 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 06/16] eal: add rte_gettid() to acquire unique system tid Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 08/16] enic: fix re-define freebsd compile complain Cunming Liang
                         ` (9 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

Each EAL thread sets its core affinity from its assigned cpuset during startup.
The original 1:1 lcore-to-CPU mapping is kept if the '--lcores' option is not used.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
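The new log lines print the cpuset as a comma-separated list. As a rough
illustration of that formatting step only, here is a standalone sketch
using the glibc cpu_set_t macros. sketch_cpuset_to_str() is a
hypothetical name, not the eal_thread_dump_affinity() from this series,
and it scans up to CPU_SETSIZE rather than RTE_MAX_LCORE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/*
 * Write the CPUs in *set as "a,b,c" into buf and return the string length.
 * An entry that does not fit is dropped whole, so buf never ends in ','.
 */
static size_t sketch_cpuset_to_str(const cpu_set_t *set, char *buf, size_t size)
{
	size_t out = 0;
	int cpu;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';                  /* empty set -> empty string */

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		int ret;

		if (!CPU_ISSET(cpu, set))
			continue;

		ret = snprintf(buf + out, size - out, out ? ",%d" : "%d", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			buf[out] = '\0';        /* undo the partial entry */
			break;
		}
		out += (size_t)ret;
	}
	return out;
}
```

With CPUs 1, 3 and 10 set, a large enough buffer receives "1,3,10";
a 4-byte buffer receives "1,3".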
 lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
 lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
 lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
 4 files changed, 54 insertions(+), 96 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..98c5a83 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
 	int i, fctret, ret;
 	pthread_t thread_id;
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
-		rte_config.master_lcore, thread_id);
-
 	eal_check_mem_on_local_socket();
 
 	rte_eal_mcfg_complete();
 
+	eal_thread_init_master(rte_config.master_lcore);
+
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		rte_config.master_lcore, thread_id, cpuset);
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
@@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
 			rte_panic("Cannot create thread\n");
 	}
 
-	eal_thread_init_master(rte_config.master_lcore);
-
 	/*
 	 * Launch a dummy function on all slave lcores, so that master lcore
 	 * knows they are all ready when this function returns.
diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index d0c077b..5b16302 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpuset_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +146,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +158,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%p)\n",
-		lcore_id, thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +168,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		lcore_id, thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f99e158..c95adec 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -702,6 +702,7 @@ rte_eal_init(int argc, char **argv)
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
 	struct shared_driver *solib = NULL;
 	const char *logid;
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -802,8 +803,10 @@ rte_eal_init(int argc, char **argv)
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%x)\n",
-		rte_config.master_lcore, (int)thread_id);
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		rte_config.master_lcore, (int)thread_id, cpuset);
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index ed20c93..6eb1525 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -52,6 +52,7 @@
 #include <rte_eal.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_memcpy.h>
 
 #include "eal_private.h"
 #include "eal_thread.h"
@@ -97,61 +98,34 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
 	return 0;
 }
 
-/* set affinity for current thread */
+/* set affinity for current EAL thread */
 static int
 eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpu_set_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpu_set_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	rte_memcpy(&RTE_PER_LCORE(_cpuset),
+		   &lcore_config[lcore_id].cpuset, sizeof(rte_cpuset_t));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +148,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +160,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%x)\n",
-		lcore_id, (int)thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +170,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		lcore_id, (int)thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 08/16] enic: fix re-define freebsd compile complain
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (6 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 07/16] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 09/16] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
                         ` (8 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

Some macros are already defined by FreeBSD's 'sys/param.h'.
Undefine them before redefining them to avoid the compile-time complaint.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_pmd_enic/enic.h        | 1 +
 lib/librte_pmd_enic/enic_compat.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
index c43417c..189c3b9 100644
--- a/lib/librte_pmd_enic/enic.h
+++ b/lib/librte_pmd_enic/enic.h
@@ -66,6 +66,7 @@
 #define ENIC_CALC_IP_CKSUM      1
 #define ENIC_CALC_TCP_UDP_CKSUM 2
 #define ENIC_MAX_MTU            9000
+#undef PAGE_SIZE
 #define PAGE_SIZE               4096
 #define PAGE_ROUND_UP(x) \
 	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
index b1af838..b84c766 100644
--- a/lib/librte_pmd_enic/enic_compat.h
+++ b/lib/librte_pmd_enic/enic_compat.h
@@ -67,6 +67,7 @@
 #define pr_warn(y, args...) dev_warning(0, y, ##args)
 #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
 
+#undef ALIGN
 #define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
 #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
 #define udelay usleep
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 09/16] malloc: fix the issue of SOCKET_ID_ANY
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (7 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 08/16] enic: fix re-define freebsd compile complain Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 10/16] log: fix the gap to support non-EAL thread Cunming Liang
                         ` (7 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

Add a check on the return value of rte_socket_id() to avoid using an
unexpected value such as SOCKET_ID_ANY (-1); fall back to socket 0 in that case.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
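A small standalone sketch of why the fallback matters when the socket id
is used as an array index. All sketch_* names and the heap table are
hypothetical; only the comparison against (unsigned)SOCKET_ID_ANY mirrors
the change below.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SOCKET_ID_ANY (-1)   /* mirrors SOCKET_ID_ANY */
#define SKETCH_MAX_SOCKETS   8      /* assumed table size */

struct sketch_heap {
	size_t total_size;
};

static struct sketch_heap sketch_heaps[SKETCH_MAX_SOCKETS];

/* Map "no socket known" to socket 0, pass everything else through. */
static unsigned sketch_numa_socket(unsigned socket_id)
{
	if (socket_id == (unsigned)SKETCH_SOCKET_ID_ANY)
		return 0;
	return socket_id;
}

/* Without the mapping above, (unsigned)-1 would index far out of bounds. */
static struct sketch_heap *sketch_heap_for(unsigned socket_id)
{
	unsigned idx = sketch_numa_socket(socket_id);

	assert(idx < SKETCH_MAX_SOCKETS);
	return &sketch_heaps[idx];
}
```

A non-EAL thread that has not set its affinity therefore allocates from
the heap of socket 0.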
 lib/librte_malloc/malloc_heap.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-	return rte_socket_id();
+	unsigned socket_id = rte_socket_id();
+
+	if (socket_id == (unsigned)SOCKET_ID_ANY)
+		return 0;
+
+	return socket_id;
 }
 
 void *
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 10/16] log: fix the gap to support non-EAL thread
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (8 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 09/16] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 11/16] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
                         ` (6 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

For a non-EAL thread, *_lcore_id* is invalid and may be greater than or equal to RTE_MAX_LCORE.
The patch adds this check and allows only EAL threads to use the per-thread log level and log type.
Other threads share the global log level and log type.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
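The guard pattern, sketched standalone. The sketch_* names, the table
size and the default level are assumptions for illustration; the lcore
id is passed in as a parameter instead of being read from rte_lcore_id().

```c
#include <stdint.h>

#define SKETCH_MAX_LCORE    128             /* assumed table size */
#define SKETCH_LCORE_ID_ANY ((unsigned)-1)  /* id seen by non-EAL threads */

struct sketch_cur_msg {
	uint32_t loglevel;
	uint32_t logtype;
};

static struct sketch_cur_msg sketch_log_cur_msg[SKETCH_MAX_LCORE];
static uint32_t sketch_global_level = 8;    /* assumed global log level */

/* Record the current message only if the caller owns a table slot. */
static void sketch_save_cur_msg(unsigned lcore_id, uint32_t level, uint32_t type)
{
	if (lcore_id < SKETCH_MAX_LCORE) {
		sketch_log_cur_msg[lcore_id].loglevel = level;
		sketch_log_cur_msg[lcore_id].logtype = type;
	}
}

/* Threads without a slot fall back to the global level. */
static uint32_t sketch_cur_msg_loglevel(unsigned lcore_id)
{
	if (lcore_id >= SKETCH_MAX_LCORE)
		return sketch_global_level;
	return sketch_log_cur_msg[lcore_id].loglevel;
}
```

Both the write and the read are guarded, so a non-EAL thread never
touches the per-lcore array.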
 lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
 lib/librte_eal/common/include/rte_log.h |  5 +++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
 		rte_logs.type &= (~type);
 }
 
+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+	return rte_logs.type;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_level();
 	return log_cur_msg[lcore_id].loglevel;
 }
 
@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_type();
 	return log_cur_msg[lcore_id].logtype;
 }
 
@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
 
 	/* save loglevel and logtype in a global per-lcore variable */
 	lcore_id = rte_lcore_id();
-	log_cur_msg[lcore_id].loglevel = level;
-	log_cur_msg[lcore_id].logtype = logtype;
+	if (lcore_id < RTE_MAX_LCORE) {
+		log_cur_msg[lcore_id].loglevel = level;
+		log_cur_msg[lcore_id].logtype = logtype;
+	}
 
 	ret = vfprintf(f, format, ap);
 	fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);
 
 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 11/16] eal: set _lcore_id and _socket_id to (-1) by default
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (9 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 10/16] log: fix the gap to support non-EAL thread Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 12/16] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
                         ` (5 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

For a non-EAL thread, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an index need to take care of this case.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its affinity
by calling rte_thread_set_affinity().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 5b16302..2b3c9a8 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 6eb1525..ab94e20 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -57,8 +57,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 12/16] eal: fix recursive spinlock in non-EAL thraed
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (10 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 11/16] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 13/16] mempool: add support to non-EAL thread Cunming Liang
                         ` (4 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

In a non-EAL thread, lcore_id is always LCORE_ID_ANY.
It cannot be used as the unique ID for a recursive spinlock.
Use rte_gettid() instead.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
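To illustrate the problem, here is a simplified recursive lock keyed by
an owner id. It is a sketch built on C11 atomic_flag, not the
rte_spinlock code; the sketch_* names are hypothetical and the caller id
is passed in explicitly so the effect of a shared id is easy to see.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NO_OWNER (-1)

struct sketch_rec_lock {
	atomic_flag locked;   /* underlying non-recursive lock */
	int owner;            /* id of the current holder, or SKETCH_NO_OWNER */
	int count;            /* recursion depth */
};

static void sketch_rec_init(struct sketch_rec_lock *l)
{
	atomic_flag_clear(&l->locked);
	l->owner = SKETCH_NO_OWNER;
	l->count = 0;
}

/* Returns 1 if the lock was taken or re-entered, 0 if it is held elsewhere. */
static int sketch_rec_trylock(struct sketch_rec_lock *l, int id)
{
	if (l->owner != id) {
		if (atomic_flag_test_and_set(&l->locked))
			return 0;
		l->owner = id;
	}
	l->count++;
	return 1;
}

static void sketch_rec_unlock(struct sketch_rec_lock *l)
{
	assert(l->count > 0);
	if (--l->count == 0) {
		l->owner = SKETCH_NO_OWNER;
		atomic_flag_clear(&l->locked);
	}
}
```

With distinct ids (for example two different tids) the second caller is
refused. With the shared id -1, which is what every non-EAL thread would
present if lcore_id were used, every caller looks like the owner and is
let through without taking the underlying lock.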
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 13/16] mempool: add support to non-EAL thread
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (11 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 12/16] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 14/16] ring: " Cunming Liang
                         ` (3 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

For a non-EAL thread, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used from either an EAL thread or any user pthread.
In a non-EAL thread, it relies directly on rte_ring, which is non-preemptive.
Running multiple pthreads per CPU that compete for the same rte_mempool is not recommended:
performance will be poor, and there is a critical risk if the scheduling policy is RT.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
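The cache-bypass decision on the put path, reduced to a standalone
predicate. sketch_put_uses_cache() and SKETCH_MAX_LCORE are hypothetical;
the three conditions mirror the check added to __mempool_put_bulk()
below, and the later overflow check is left out.

```c
#define SKETCH_MAX_LCORE 128   /* assumed stand-in for RTE_MAX_LCORE */

/*
 * Returns 1 if a put may go through the per-lcore cache, 0 if it has to
 * be enqueued on the ring directly.
 */
static int sketch_put_uses_cache(unsigned cache_size, int is_mp,
				 unsigned lcore_id)
{
	if (cache_size == 0)                /* cache disabled */
		return 0;
	if (is_mp == 0)                     /* single producer */
		return 0;
	if (lcore_id >= SKETCH_MAX_LCORE)   /* non-EAL thread: no cache slot */
		return 0;
	return 1;
}
```

A non-EAL thread presents LCORE_ID_ANY, that is (unsigned)-1, so it
always takes the ring path and never indexes local_cache[].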
 lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..4845f27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
-		unsigned __lcore_id = rte_lcore_id();		\
-		mp->stats[__lcore_id].name##_objs += n;		\
-		mp->stats[__lcore_id].name##_bulk += 1;		\
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			mp->stats[__lcore_id].name##_objs += n;	\
+			mp->stats[__lcore_id].name##_bulk += 1;	\
+		}                                               \
 	} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 	__MEMPOOL_STAT_ADD(mp, put, n);
 
 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-	/* cache is not enabled or single producer */
-	if (unlikely(cache_size == 0 || is_mp == 0))
+	/* cache is not enabled or single producer or non-EAL thread */
+	if (unlikely(cache_size == 0 || is_mp == 0 ||
+		     lcore_id >= RTE_MAX_LCORE))
 		goto ring_enqueue;
 
 	/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
 	uint32_t cache_size = mp->cache_size;
 
 	/* cache is not enabled or single consumer */
-	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+	if (unlikely(cache_size == 0 || is_mc == 0 ||
+		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
 		goto ring_dequeue;
 
 	cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 14/16] ring: add support to non-EAL thread
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (12 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 13/16] mempool: add support to non-EAL thread Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 15/16] ring: add sched_yield to avoid spin forever Cunming Liang
                         ` (2 subsequent siblings)
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

The ring debug statistics do not account for non-EAL threads.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_ring/rte_ring.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..39bacdd 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {		\
-		unsigned __lcore_id = rte_lcore_id();	\
-		r->stats[__lcore_id].name##_objs += n;	\
-		r->stats[__lcore_id].name##_bulk += 1;	\
+#define __RING_STAT_ADD(r, name, n) do {                        \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			r->stats[__lcore_id].name##_objs += n;  \
+			r->stats[__lcore_id].name##_bulk += 1;  \
+		}                                               \
 	} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4

* [dpdk-dev] [PATCH v3 15/16] ring: add sched_yield to avoid spin forever
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (13 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 14/16] ring: " Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 16/16] timer: add support to non-EAL thread Cunming Liang
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

Do a gentle yield after spinning for a while.
This reduces the CPU time wasted on spinning when preemption happens.
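
The wait loop can be sketched in isolation as below. This is an
illustration under assumptions: sketch_pause() is a hypothetical empty
placeholder for rte_pause(), and the yield counter is added only so the
behaviour can be observed.

```c
#include <sched.h>
#include <stdint.h>

#define SKETCH_PAUSE_REP 0x100 /* same threshold value as RTE_RING_PAUSE_REP */

/* Hypothetical placeholder for rte_pause(); a real build would issue a CPU hint. */
static inline void sketch_pause(void) { }

/*
 * Spin until *tail == expected. After SKETCH_PAUSE_REP unsuccessful
 * polls, yield the CPU once and start a new round of polling.
 * Returns the number of times the caller yielded.
 */
static unsigned
sketch_wait_tail(const volatile uint32_t *tail, uint32_t expected)
{
	unsigned rep, yields = 0;

	do {
		for (rep = SKETCH_PAUSE_REP;
		     rep != 0 && *tail != expected; rep--)
			sketch_pause();

		if (rep == 0) {
			sched_yield();
			yields++;
		}
	} while (rep == 0);

	return yields;
}
```

When the tail already matches, the loop exits on the first poll without
yielding, so the uncontended path pays only for the extra counter.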

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_ring/rte_ring.h | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 39bacdd..c16da6e 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -126,6 +126,7 @@ struct rte_ring_debug_stats {
 
 #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
 #define RTE_RING_MZ_PREFIX "RG_"
+#define RTE_RING_PAUSE_REP 0x100  /**< yield after num of times pause. */
 
 /**
  * An RTE ring structure.
@@ -410,7 +411,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep;
 	uint32_t mask = r->prod.mask;
 	int ret;
 
@@ -468,8 +469,14 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head))
-		rte_pause();
+	do {
+		for (rep = RTE_RING_PAUSE_REP;
+		     rep != 0 && r->prod.tail != prod_head; rep--)
+			rte_pause();
+
+		if (rep == 0)
+			sched_yield();
+	}while(rep == 0);
 
 	r->prod.tail = prod_next;
 	return ret;
@@ -589,7 +596,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep;
 	uint32_t mask = r->prod.mask;
 
 	/* move cons.head atomically */
@@ -634,8 +641,14 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head))
-		rte_pause();
+	do {
+		for (rep = RTE_RING_PAUSE_REP;
+		     rep != 0 && r->cons.tail != cons_head; rep--)
+			rte_pause();
+
+		if (rep == 0)
+			sched_yield();
+	}while(rep == 0);
 
 	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v3 16/16] timer: add support to non-EAL thread
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (14 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 15/16] ring: add sched_yield to avoid spin forever Cunming Liang
@ 2015-01-29  0:24       ` Cunming Liang
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
  16 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-01-29  0:24 UTC (permalink / raw)
  To: dev

Allow setting up timers only for EAL (lcore) threads (__lcore_id < RTE_MAX_LCORE).
E.g. a dynamically created thread is able to reset/stop a timer for an lcore thread,
but it is not allowed to set up a timer for itself or for another non-lcore thread.
rte_timer_manage() called from a non-lcore thread simply does nothing and returns straight away.
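
A minimal sketch of the two owner checks this patch touches in
timer_set_config_state(), assuming the owner is held as a 16-bit value.
All sketch_* names and SKETCH_* constants are hypothetical stand-ins, and
the existing check for a timer already being configured is left out.

```c
#include <stdint.h>

#define SKETCH_MAX_LCORE    128            /* assumed value, for illustration */
#define SKETCH_LCORE_ID_ANY ((unsigned)-1)
#define SKETCH_NO_OWNER     ((uint16_t)-2) /* differs from (uint16_t)LCORE_ID_ANY */
#define SKETCH_RUNNING      2

/*
 * Decide whether the calling thread may start configuring a timer.
 * Returns 0 if allowed, -1 if refused.
 */
static int
sketch_may_configure(unsigned lcore_id, uint16_t state, uint16_t owner)
{
	/* every non-EAL thread is folded into the single id LCORE_ID_ANY */
	if (lcore_id >= SKETCH_MAX_LCORE)
		lcore_id = SKETCH_LCORE_ID_ANY;

	/* another non-EAL thread is already updating this timer */
	if (lcore_id == SKETCH_LCORE_ID_ANY && (uint16_t)lcore_id == owner)
		return -1;

	/* the timer callback is running on a different lcore */
	if (state == SKETCH_RUNNING && owner != (uint16_t)lcore_id)
		return -1;

	return 0;
}
```

In this sketch a no-owner marker of -2 keeps an unowned timer from looking
as if a non-EAL thread held it, which appears to be why the header changes
RTE_TIMER_NO_OWNER.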

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++---------
 lib/librte_timer/rte_timer.h |  2 +-
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..601c159 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
 
 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do {				\
-		unsigned __lcore_id = rte_lcore_id();		\
-		priv_timer[__lcore_id].stats.name += (n);	\
+#define __TIMER_STAT_ADD(name, n) do {					\
+		unsigned __lcore_id = rte_lcore_id();			\
+		if (__lcore_id < RTE_MAX_LCORE)				\
+			priv_timer[__lcore_id].stats.name += (n);	\
 	} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
 	unsigned lcore_id;
 
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		lcore_id = LCORE_ID_ANY;
 
 	/* wait that the timer is in correct status before update,
 	 * and mark it as being configured */
 	while (success == 0) {
 		prev_status.u32 = tim->status.u32;
 
+		/*
+		 * prevent race condition of non-EAL threads
+		 * to update the timer. When 'owner == LCORE_ID_ANY',
+		 * it means updated by a non-EAL thread.
+		 */
+		if (lcore_id == (unsigned)LCORE_ID_ANY &&
+		    (uint16_t)lcore_id == prev_status.owner)
+			return -1;
+
 		/* timer is running on another core, exit */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    (unsigned)prev_status.owner != lcore_id)
+		    prev_status.owner != (uint16_t)lcore_id)
 			return -1;
 
 		/* timer is being configured on another core */
@@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 
 	/* round robin for tim_lcore */
 	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-					       0, 1);
-		priv_timer[lcore_id].prev_lcore = tim_lcore;
+		if (lcore_id < RTE_MAX_LCORE) {
+			tim_lcore = rte_get_next_lcore(
+				priv_timer[lcore_id].prev_lcore,
+				0, 1);
+			priv_timer[lcore_id].prev_lcore = tim_lcore;
+		} else
+			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
 	}
 
 	/* wait that the timer is in correct status before update,
@@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 		return -1;
 
 	__TIMER_STAT_ADD(reset, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
 		return -1;
 
 	__TIMER_STAT_ADD(stop, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -499,6 +517,10 @@ void rte_timer_manage(void)
 	uint64_t cur_time;
 	int i, ret;
 
+	/* timer manager only runs on EAL thread */
+	if (lcore_id >= RTE_MAX_LCORE)
+		return;
+
 	__TIMER_STAT_ADD(manage, 1);
 	/* optimize for the case where per-cpu list is empty */
 	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..5c5df91 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
 
-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
 
 /**
  * Timer type: Periodic or single (one-shot).
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core
  2015-01-29  0:24     ` [dpdk-dev] [PATCH v3 00/16] support multi-pthread per core Cunming Liang
                         ` (15 preceding siblings ...)
  2015-01-29  0:24       ` [dpdk-dev] [PATCH v3 16/16] timer: add support to non-EAL thread Cunming Liang
@ 2015-02-02  2:02       ` Cunming Liang
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config Cunming Liang
                           ` (18 more replies)
  16 siblings, 19 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

v4 changes:
  new patch fixing strnlen() invalid return in 32bit icc [03/17]
  update and add more comments on sched_yield() [16/17]

v3 changes:
  new patch adding sched_yield() in rte_ring to avoid long spin [16/17]

v2 changes:
  add '<number>-<number>' support for EAL option '--lcores' [02/17]

The patch series contains enhancements to the EAL and fixes for libraries
to run multiple pthreads (either EAL or non-EAL threads) per physical core.
The two major changes are listed below:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore stands for an EAL thread rather than a logical core.
  The change adds a new EAL option to allow static lcore to cpuset assignment.
  An lcore (EAL thread) then has affinity to a cpuset; the original 1:1 mapping is a special case.
- Fix the libraries to allow running on any non-EAL thread.
  This fixes the gaps when running libraries in a non-EAL thread (dynamically created by the user).
  Each fixed library takes care of the case rte_lcore_id() >= RTE_MAX_LCORE.

Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in RFC review.


Cunming Liang (17):
  eal: add cpuset into per EAL thread lcore_config
  eal: new eal option '--lcores' for cpu assignment
  eal: fix wrong strnlen() return value in 32bit icc
  eal: add support parsing socket_id from cpuset
  eal: new TLS definition and API declaration
  eal: add eal_common_thread.c for common thread API
  eal: add rte_gettid() to acquire unique system tid
  eal: apply affinity of EAL thread by assigned cpuset
  enic: fix re-define freebsd compile complain
  malloc: fix the issue of SOCKET_ID_ANY
  log: fix the gap to support non-EAL thread
  eal: set _lcore_id and _socket_id to (-1) by default
  eal: fix recursive spinlock in non-EAL thraed
  mempool: add support to non-EAL thread
  ring: add support to non-EAL thread
  ring: add sched_yield to avoid spin forever
  timer: add support to non-EAL thread

 lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
 lib/librte_eal/bsdapp/eal/eal.c                    |  13 +-
 lib/librte_eal/bsdapp/eal/eal_lcore.c              |  14 +
 lib/librte_eal/bsdapp/eal/eal_memory.c             |   2 +
 lib/librte_eal/bsdapp/eal/eal_thread.c             |  76 +++---
 lib/librte_eal/common/eal_common_launch.c          |   1 -
 lib/librte_eal/common/eal_common_log.c             |  17 +-
 lib/librte_eal/common/eal_common_options.c         | 302 ++++++++++++++++++++-
 lib/librte_eal/common/eal_common_thread.c          | 142 ++++++++++
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/eal_thread.h                 |  66 +++++
 .../common/include/generic/rte_spinlock.h          |   4 +-
 lib/librte_eal/common/include/rte_eal.h            |  27 ++
 lib/librte_eal/common/include/rte_lcore.h          |  37 ++-
 lib/librte_eal/common/include/rte_log.h            |   5 +
 lib/librte_eal/linuxapp/eal/Makefile               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   7 +-
 lib/librte_eal/linuxapp/eal/eal_lcore.c            |  15 +
 lib/librte_eal/linuxapp/eal/eal_thread.c           |  78 +++---
 lib/librte_malloc/malloc_heap.h                    |   7 +-
 lib/librte_mempool/rte_mempool.h                   |  18 +-
 lib/librte_pmd_enic/enic.h                         |   1 +
 lib/librte_pmd_enic/enic_compat.h                  |   1 +
 lib/librte_ring/rte_ring.h                         |  45 ++-
 lib/librte_timer/rte_timer.c                       |  40 ++-
 lib/librte_timer/rte_timer.h                       |   2 +-
 26 files changed, 789 insertions(+), 138 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 19:59           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment Cunming Liang
                           ` (17 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

The patch adds 'cpuset' to the per-lcore configuration 'lcore_config[]',
as an lcore is no longer always pinned 1:1 to a physical cpu.
An lcore now stands for an EAL thread rather than a logical cpu.

It doesn't change the default behavior of 1:1 mapping, but allows
setting the affinity of an EAL thread to multiple cpus.
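
The default mapping can be sketched as follows. This is a portable
illustration, not the patch code: a 64-bit mask stands in for
rte_cpuset_t, and every sketch_* name is hypothetical.

```c
#include <stdint.h>

#define SKETCH_MAX_LCORE 8

enum sketch_role { SKETCH_ROLE_OFF, SKETCH_ROLE_RTE };

struct sketch_lcore_config {
	int detected;          /* non-zero if the matching cpu exists */
	enum sketch_role role;
	uint64_t cpuset;       /* bit i set: the thread may run on cpu i */
};

/* Default setup: lcore i is enabled and pinned to cpu i if that cpu exists. */
static void
sketch_cpu_init(struct sketch_lcore_config cfg[SKETCH_MAX_LCORE],
		unsigned ncpus)
{
	unsigned id;

	for (id = 0; id < SKETCH_MAX_LCORE; id++) {
		cfg[id].cpuset = 0;                 /* plays the role of CPU_ZERO() */
		cfg[id].detected = (id < ncpus);
		if (!cfg[id].detected) {
			cfg[id].role = SKETCH_ROLE_OFF;
			continue;
		}
		cfg[id].cpuset = UINT64_C(1) << id; /* plays the role of CPU_SET() */
		cfg[id].role = SKETCH_ROLE_RTE;
	}
}
```

Keeping a set per lcore, even when it holds a single cpu, is what lets a
later option widen the set without changing the default.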

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c     | 7 +++++++
 lib/librte_eal/bsdapp/eal/eal_memory.c    | 2 ++
 lib/librte_eal/common/include/rte_lcore.h | 8 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile      | 1 +
 lib/librte_eal/linuxapp/eal/eal_lcore.c   | 8 ++++++++
 5 files changed, 26 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 662f024..72f8ac2 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -76,11 +76,18 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
 		lcore_config[lcore_id].detected = (lcore_id < ncpus);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 65ee87d..a34d500 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -45,6 +45,8 @@
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 
+/* avoid a redefinition conflict with the FreeBSD header */
+#undef PAGE_SIZE
 #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
 
 /*
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 49b2c03..4c7d6bb 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -50,6 +50,13 @@ extern "C" {
 
 #define LCORE_ID_ANY -1    /**< Any lcore. */
 
+#if defined(__linux__)
+	typedef	cpu_set_t rte_cpuset_t;
+#elif defined(__FreeBSD__)
+#include <pthread_np.h>
+	typedef cpuset_t rte_cpuset_t;
+#endif
+
 /**
  * Structure storing internal configuration (per-lcore)
  */
@@ -65,6 +72,7 @@ struct lcore_config {
 	unsigned socket_id;        /**< physical socket id for this lcore */
 	unsigned core_id;          /**< core number on socket for this lcore */
 	int core_index;            /**< relative index, starting from 0 */
+	rte_cpuset_t cpuset;       /**< cpu set which the lcore affinity to */
 };
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 72ecf3a..0e9c447 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -87,6 +87,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
+CFLAGS_eal_lcore.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index c67e0e6..29615f8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -158,11 +158,19 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
+		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 19:59           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc Cunming Liang
                           ` (16 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

Add a new EAL long option '--lcores' for EAL thread cpuset assignment.

The format pattern:
	--lcores='lcores[@cpus]<,lcores[@cpus]>'
lcores and cpus can each be a single number, a range or a group.
'(' and ')' are required for a group.
If '@cpus' is not supplied, cpus takes the same value as lcores.

e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL threads as below
  lcore 0 runs on cpuset 0x41 (cpu 0,6)
  lcore 1 runs on cpuset 0x2 (cpu 1)
  lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
  lcores 3,4,5 run on cpuset 0x5 (cpu 0,2)
  lcore 6 runs on cpuset 0x41 (cpu 0,6)
  lcore 7 runs on cpuset 0x80 (cpu 7)
  lcore 8 runs on cpuset 0x100 (cpu 8)
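
The table above can be reproduced with a small sketch. It models only the
assignment step, not the string parsing, and uses 64-bit masks in place of
rte_cpuset_t; all sketch_* names are hypothetical. The one_to_one flag
models the parser's behaviour for an element that has no '@cpus' and
contains a '-' range.

```c
#include <stdint.h>

#define SKETCH_MAX_LCORE 16

/* Bit i set means "cpu i" (or "lcore i") is a member of the set. */
typedef uint64_t sketch_set_t;

/* Build a set from an inclusive range, e.g. 5-7 gives 0xe0. */
static sketch_set_t
sketch_range(unsigned min, unsigned max)
{
	sketch_set_t s = 0;
	unsigned i;

	for (i = min; i <= max; i++)
		s |= (sketch_set_t)1 << i;
	return s;
}

/*
 * Give every lcore in 'lcores' the cpuset 'cpus'. When one_to_one is
 * non-zero, each lcore is instead pinned to the cpu with the same id.
 */
static void
sketch_assign(sketch_set_t table[SKETCH_MAX_LCORE], sketch_set_t lcores,
	      sketch_set_t cpus, int one_to_one)
{
	unsigned i;

	for (i = 0; i < SKETCH_MAX_LCORE; i++) {
		if (!(lcores & ((sketch_set_t)1 << i)))
			continue;
		table[i] = one_to_one ? (sketch_set_t)1 << i : cpus;
	}
}
```

For example, '(0,6)' corresponds to sketch_assign(t, 0x41, 0x41, 0), so
both lcores share cpus 0 and 6, while '7-8' corresponds to
sketch_assign(t, sketch_range(7, 8), 0, 1), which pins lcore 7 to cpu 7
and lcore 8 to cpu 8.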

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_launch.c  |   1 -
 lib/librte_eal/common/eal_common_options.c | 300 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/Makefile       |   1 +
 4 files changed, 299 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
index 599f83b..2d732b1 100644
--- a/lib/librte_eal/common/eal_common_launch.c
+++ b/lib/librte_eal/common/eal_common_launch.c
@@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
 		rte_eal_wait_lcore(lcore_id);
 	}
 }
-
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 67e02dc..29ebb6f 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -45,6 +45,7 @@
 #include <rte_lcore.h>
 #include <rte_version.h>
 #include <rte_devargs.h>
+#include <rte_memcpy.h>
 
 #include "eal_internal_cfg.h"
 #include "eal_options.h"
@@ -85,6 +86,7 @@ eal_long_options[] = {
 	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
 	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
 	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
+	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
 	{0, 0, 0, 0}
 };
 
@@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
 			if (min == RTE_MAX_LCORE)
 				min = idx;
 			for (idx = min; idx <= max; idx++) {
-				cfg->lcore_role[idx] = ROLE_RTE;
-				lcore_config[idx].core_index = count;
-				count++;
+				if (cfg->lcore_role[idx] != ROLE_RTE) {
+					cfg->lcore_role[idx] = ROLE_RTE;
+					lcore_config[idx].core_index = count;
+					count++;
+				}
 			}
 			min = RTE_MAX_LCORE;
 		} else
@@ -292,6 +296,279 @@ eal_parse_master_lcore(const char *arg)
 	return 0;
 }
 
+/*
+ * Parse an element, which can be a single number, a range or a '(' ')' group.
+ * Within a group, '-' is used as the range separator;
+ *                 ',' is used to separate single numbers.
+ */
+static int
+eal_parse_set(const char *input, uint16_t set[], unsigned num)
+{
+	unsigned idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit or a left bracket qualifies as a start point */
+	if ((!isdigit(*str) && *str != '(') || *str == '\0')
+		return -1;
+
+	/* process single number or single range of number */
+	if (*str != '(') {
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+		else {
+			while (isblank(*end))
+				end++;
+
+			min = idx;
+			max = idx;
+			if (*end == '-') {
+				/* process single <number>-<number> */
+				end++;
+				while (isblank(*end))
+					end++;
+				if (!isdigit(*end))
+					return -1;
+
+				errno = 0;
+				idx = strtoul(end, &end, 10);
+				if (errno || end == NULL || idx >= num)
+					return -1;
+				max = idx;
+				while (isblank(*end))
+					end++;
+				if (*end != ',' && *end != '\0')
+					return -1;
+			}
+
+			if (*end != ',' && *end != '\0' &&
+			    *end != '@')
+				return -1;
+
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			return end - input;
+		}
+	}
+
+	/* process set within bracket */
+	str++;
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = RTE_MAX_LCORE;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-',',' and ')' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == ')')) {
+			max = idx;
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			min = RTE_MAX_LCORE;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0' && *end != ')');
+
+	return str - input;
+}
+
+/* convert from set array to cpuset bitmap */
+static inline int
+convert_to_cpuset(rte_cpuset_t *cpusetp,
+	      uint16_t *set, unsigned num)
+{
+	unsigned idx;
+
+	CPU_ZERO(cpusetp);
+
+	for (idx = 0; idx < num; idx++) {
+		if (!set[idx])
+			continue;
+
+		if (!lcore_config[idx].detected) {
+			RTE_LOG(ERR, EAL, "core %u "
+				"unavailable\n", idx);
+			return -1;
+		}
+
+		CPU_SET(idx, cpusetp);
+	}
+
+	return 0;
+}
+
+/*
+ * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
+ * lcores and cpus can each be a single number, a range or a group.
+ * '(' and ')' are required for a group.
+ * If '@cpus' is not supplied, cpus takes the same value as lcores.
+ * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means start 9 EAL threads as below
+ *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 1 runs on cpuset 0x2 (cpu 1)
+ *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
+ *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
+ *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 7 runs on cpuset 0x80 (cpu 7)
+ *   lcore 8 runs on cpuset 0x100 (cpu 8)
+ */
+static int
+eal_parse_lcores(const char *lcores)
+{
+	struct rte_config *cfg = rte_eal_get_configuration();
+	static uint16_t set[RTE_MAX_LCORE];
+	unsigned idx = 0;
+	int i;
+	unsigned count = 0;
+	const char *lcore_start = NULL;
+	const char *end = NULL;
+	int offset;
+	rte_cpuset_t cpuset;
+	int lflags = 0;
+	int ret = -1;
+
+	if (lcores == NULL)
+		return -1;
+
+	/* Remove all blank characters ahead and after */
+	while (isblank(*lcores))
+		lcores++;
+	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	while ((i > 0) && isblank(lcores[i - 1]))
+		i--;
+
+	CPU_ZERO(&cpuset);
+
+	/* Reset lcore config */
+	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+		cfg->lcore_role[idx] = ROLE_OFF;
+		lcore_config[idx].core_index = -1;
+		CPU_ZERO(&lcore_config[idx].cpuset);
+	}
+
+	/* Get list of cores */
+	do {
+		while (isblank(*lcores))
+			lcores++;
+		if (*lcores == '\0')
+			goto err;
+
+		/* record lcore_set start point */
+		lcore_start = lcores;
+
+		/* go across a complete bracket */
+		if (*lcore_start == '(') {
+			lcores += strcspn(lcores, ")");
+			if (*lcores++ == '\0')
+				goto err;
+		}
+
+		/* scan the separator '@', ','(next) or '\0'(finish) */
+		lcores += strcspn(lcores, "@,");
+
+		if (*lcores == '@') {
+			/* explicitly assign cpu_set */
+			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
+			if (offset < 0)
+				goto err;
+
+			/* prepare cpu_set and update the end cursor */
+			if (0 > convert_to_cpuset(&cpuset,
+						  set, RTE_DIM(set)))
+				goto err;
+			end = lcores + 1 + offset;
+		} else { /* ',' or '\0' */
+			/* haven't given cpu_set, current loop done */
+			end = lcores;
+
+			/* go back to check <number>-<number> */
+			offset = strcspn(lcore_start, "-");
+			if (offset < (end - lcore_start))
+				lflags = 1;
+		}
+
+		if (*end != ',' && *end != '\0')
+			goto err;
+
+		/* parse lcore_set from start point */
+		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
+			goto err;
+
+		/* without '@', by default using lcore_set as cpu_set */
+		if (*lcores != '@' &&
+		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
+			goto err;
+
+		/* start to update lcore_set */
+		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+			if (!set[idx])
+				continue;
+
+			if (cfg->lcore_role[idx] != ROLE_RTE) {
+				lcore_config[idx].core_index = count;
+				cfg->lcore_role[idx] = ROLE_RTE;
+				count++;
+			}
+
+			if (lflags) {
+				CPU_ZERO(&cpuset);
+				CPU_SET(idx, &cpuset);
+			}
+			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
+				   sizeof(rte_cpuset_t));
+		}
+
+		lcores = end + 1;
+	} while (*end != '\0');
+
+	if (count == 0)
+		goto err;
+
+	cfg->lcore_count = count;
+	lcores_parsed = 1;
+	ret = 0;
+
+err:
+
+	return ret;
+}
+
 static int
 eal_parse_syslog(const char *facility, struct internal_config *conf)
 {
@@ -492,6 +769,13 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->log_level = log;
 		break;
 	}
+	case OPT_LCORES_NUM:
+		if (eal_parse_lcores(optarg) < 0) {
+			RTE_LOG(ERR, EAL, "invalid parameter for --"
+				OPT_LCORES "\n");
+			return -1;
+		}
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -530,7 +814,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
 
 	if (!lcores_parsed) {
 		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
-			"-c or -l\n");
+			"-c, -l or --lcores\n");
 		return -1;
 	}
 	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
@@ -586,6 +870,14 @@ eal_common_usage(void)
 	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
 	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
 	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
+	       "  --"OPT_LCORES" MAP: maps between lcore_set to phys_cpu_set\n"
+	       "                 The argument format is\n"
+	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
+	       "                 lcores and cpus list are grouped by '(' and ')'\n"
+	       "                 Within the group, '-' is used for range separator,\n"
+	       "                 ',' is used for single number separator.\n"
+	       "                 '( )' can be omitted for single element group, '@' \n"
+	       "                 can be omitted if cpus and lcores has the same value\n"
 	       "  -n NUM       : Number of memory channels\n"
 	       "  -v           : Display version information on startup\n"
 	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e476f8d..a1cc59f 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -77,6 +77,8 @@ enum {
 	OPT_CREATE_UIO_DEV_NUM,
 #define OPT_VFIO_INTR    "vfio-intr"
 	OPT_VFIO_INTR_NUM,
+#define OPT_LCORES "lcores"
+	OPT_LCORES_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 0e9c447..025d836 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -95,6 +95,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
+CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config Cunming Liang
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 19:59           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset Cunming Liang
                           ` (15 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

The problem is that strnlen() here may return an invalid value with 32bit icc
(actually it returns its second parameter, e.g. sysconf(_SC_ARG_MAX)).
It starts to manifest when the max_len parameter is > 2M and icc -m32 -O2 (or above) is used.
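
For reference, the expected behaviour of a bounded length scan can be
sketched as below. sketch_strnlen() is a hypothetical, portable equivalent
written with memchr(), not the libc routine, and sketch_trimmed_len()
mirrors the trailing-blank trimming done by the parsers.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of s, but never more than maxlen (the strnlen() contract). */
static size_t
sketch_strnlen(const char *s, size_t maxlen)
{
	const char *nul = memchr(s, '\0', maxlen);

	return nul != NULL ? (size_t)(nul - s) : maxlen;
}

/* Length of s with trailing blanks dropped, scanning at most maxlen bytes. */
static size_t
sketch_trimmed_len(const char *s, size_t maxlen)
{
	size_t i = sketch_strnlen(s, maxlen);

	while (i > 0 && isblank((unsigned char)s[i - 1]))
		i--;
	return i;
}
```

The bound should be returned only when no terminator is found within it,
so for a short option string the result must not depend on how large the
bound is.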

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/common/eal_common_options.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 29ebb6f..22d5d37 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -227,7 +227,7 @@ eal_parse_corelist(const char *corelist)
 	/* Remove all blank characters ahead and after */
 	while (isblank(*corelist))
 		corelist++;
-	i = strnlen(corelist, sysconf(_SC_ARG_MAX));
+	i = strnlen(corelist, PATH_MAX);
 	while ((i > 0) && isblank(corelist[i - 1]))
 		i--;
 
@@ -469,7 +469,7 @@ eal_parse_lcores(const char *lcores)
 	/* Remove all blank characters ahead and after */
 	while (isblank(*lcores))
 		lcores++;
-	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	i = strnlen(lcores, PATH_MAX);
 	while ((i > 0) && isblank(lcores[i - 1]))
 		i--;
 
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (2 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration Cunming Liang
                           ` (14 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

It returns the socket_id if all cpus in the cpuset belong
to the same NUMA node; otherwise it returns SOCKET_ID_ANY.
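
The rule can be sketched as below, assuming a fixed cpu to NUMA node table
and a 64-bit mask in place of rte_cpuset_t. The table contents and all
sketch_* names are hypothetical; a real system would query the OS for the
topology.

```c
#include <stdint.h>

#define SKETCH_SOCKET_ID_ANY (-1)
#define SKETCH_MAX_CPU 8

/* Hypothetical topology: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static const int sketch_cpu_socket[SKETCH_MAX_CPU] = {0, 0, 0, 0, 1, 1, 1, 1};

/* Return the common socket of all cpus in the mask, or SKETCH_SOCKET_ID_ANY. */
static int
sketch_cpuset_socket_id(uint64_t cpuset)
{
	int socket_id = SKETCH_SOCKET_ID_ANY;
	unsigned cpu;

	for (cpu = 0; cpu < SKETCH_MAX_CPU; cpu++) {
		if (!(cpuset & (UINT64_C(1) << cpu)))
			continue;
		if (socket_id == SKETCH_SOCKET_ID_ANY)
			socket_id = sketch_cpu_socket[cpu]; /* first member */
		else if (socket_id != sketch_cpu_socket[cpu])
			return SKETCH_SOCKET_ID_ANY;        /* spans two nodes */
	}
	return socket_id;
}
```

In this sketch an empty set also yields SKETCH_SOCKET_ID_ANY, since no cpu
ever fixes a socket.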

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++
 lib/librte_eal/common/eal_thread.h      | 52 +++++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +++++
 3 files changed, 66 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 72f8ac2..162fb4f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -41,6 +41,7 @@
 #include <rte_debug.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 
 /* No topology information available on FreeBSD including NUMA info */
 #define cpu_core_id(X) 0
@@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(__rte_unused unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index b53b84d..a25ee86 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -34,6 +34,10 @@
 #ifndef EAL_THREAD_H
 #define EAL_THREAD_H
 
+#include <sched.h>
+
+#include <rte_debug.h>
+
 /**
  * basic loop of thread, called for each thread by eal_init().
  *
@@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
  */
 void eal_thread_init_master(unsigned lcore_id);
 
+/**
+ * Get the NUMA socket id from cpu id.
+ * This function is private to EAL.
+ *
+ * @param cpu_id
+ *   The logical process id.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+unsigned eal_cpu_socket_id(unsigned cpu_id);
+
+/**
+ * Get the NUMA socket id from cpuset.
+ * This function is private to EAL.
+ *
+ * @param cpusetp
+ *   The point to a valid cpu set.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+static inline int
+eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
+{
+	unsigned cpu = 0;
+	int socket_id = SOCKET_ID_ANY;
+	int sid;
+
+	if (cpusetp == NULL)
+		return SOCKET_ID_ANY;
+
+	do {
+		if (!CPU_ISSET(cpu, cpusetp))
+			continue;
+
+		if (socket_id == SOCKET_ID_ANY)
+			socket_id = eal_cpu_socket_id(cpu);
+
+		sid = eal_cpu_socket_id(cpu);
+		if (socket_id != sid) {
+			socket_id = SOCKET_ID_ANY;
+			break;
+		}
+
+	} while (++cpu < RTE_MAX_LCORE);
+
+	return socket_id;
+}
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index 29615f8..922af6d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -45,6 +45,7 @@
 
 #include "eal_private.h"
 #include "eal_filesystem.h"
+#include "eal_thread.h"
 
 #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
 #define CORE_ID_FILE "topology/core_id"
@@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (3 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API Cunming Liang
                           ` (13 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

1. add two TLS variables, *_socket_id* and *_cpuset*
2. add two external APIs, rte_thread_set_affinity and rte_thread_get_affinity
3. add one internal API, eal_thread_dump_affinity
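
To give a feel for what a dump helper like the one in item 3 could produce,
here is a minimal sketch that formats a cpu mask as a comma-separated list.
It is not the EAL implementation: it uses a 32-bit mask instead of
rte_cpuset_t, and sketch_dump_cpus and SKETCH_MAX_CPUS are names invented
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_MAX_CPUS 32

/* Write the cpus set in mask to str as "a,b,c"; str is always terminated. */
static void
sketch_dump_cpus(uint32_t mask, char str[], size_t size)
{
	size_t out = 0;
	unsigned cpu;
	int ret;

	assert(str != NULL && size > 0);
	str[0] = '\0';

	for (cpu = 0; cpu < SKETCH_MAX_CPUS; cpu++) {
		if (!(mask & (UINT32_C(1) << cpu)))
			continue;

		ret = snprintf(str + out, size - out, "%u,", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			/* entry did not fit: drop the partial output */
			str[out] = '\0';
			break;
		}
		out += (size_t)ret;
	}

	/* remove the trailing separator */
	if (out > 0)
		str[out - 1] = '\0';
}
```

With a 16-byte buffer and mask 0x0D (cpus 0, 2 and 3) the buffer ends up
holding "0,2,3". When the buffer is too small, the sketch keeps only the
entries that fit completely.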

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
 lib/librte_eal/common/eal_thread.h        | 14 ++++++++++++++
 lib/librte_eal/common/include/rte_lcore.h | 29 +++++++++++++++++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
 4 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index ab05368..10220c7 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index a25ee86..28edf51 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
 	return socket_id;
 }
 
+/**
+ * Dump the current pthread cpuset.
+ * This function is private to EAL.
+ *
+ * @param str
+ *   The string buffer the cpuset will dump to.
+ * @param size
+ *   The string buffer size.
+ */
+#define CPU_STR_LEN            256
+void
+eal_thread_dump_affinity(char str[], unsigned size);
+
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 4c7d6bb..facdbdc 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -43,6 +43,7 @@
 #include <rte_per_lcore.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
+#include <rte_memory.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -80,7 +81,9 @@ struct lcore_config {
  */
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];
 
-RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
+RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
+RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
+RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
 
 /**
  * Return the ID of the execution unit we are running on.
@@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
 static inline unsigned
 rte_socket_id(void)
 {
-	return lcore_config[rte_lcore_id()].socket_id;
+	return RTE_PER_LCORE(_socket_id);
 }
 
 /**
@@ -229,6 +232,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap)
 	     i<RTE_MAX_LCORE;						\
 	     i = rte_get_next_lcore(i, 1, 0))
 
+/**
+ * Set core affinity of the current thread.
+ * Support both EAL and none-EAL thread and update TLS.
+ *
+ * @param cpusetp
+ *   Point to cpu_set_t for setting current thread affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_set_affinity(rte_cpuset_t *cpusetp);
+
+/**
+ * Get core affinity of the current thread.
+ *
+ * @param cpusetp
+ *   Point to cpu_set_t for getting current thread cpu affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_get_affinity(rte_cpuset_t *cpusetp);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 80a985f..748a83a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (4 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid Cunming Liang
                           ` (12 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

The API works for both EAL threads and non-EAL threads.
When rte_thread_set_affinity is called, the *_socket_id* and
*_cpuset* of the calling thread are updated if the thread
successfully sets its cpu affinity.
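
The "update the per-thread copy only on success" behaviour can be sketched
as follows. This is a Linux-only illustration under stated assumptions: it
calls sched_setaffinity() with pid 0 (the calling thread) rather than
pthread_setaffinity_np(), keeps the cache in a GCC/Clang __thread variable,
and every sketch_* name is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Per-thread cache; plays the role of the _cpuset TLS variable. */
static __thread cpu_set_t sketch_cpuset;
static __thread int sketch_cpuset_valid;

/* Apply set to the calling thread; refresh the cache only on success. */
static int
sketch_set_affinity(const cpu_set_t *set)
{
	if (set == NULL)
		return -1;

	/* pid 0 means "the calling thread" for sched_setaffinity(2) */
	if (sched_setaffinity(0, sizeof(*set), set) != 0)
		return -1;

	sketch_cpuset = *set;
	sketch_cpuset_valid = 1;
	return 0;
}

/* Read back the cached set without a system call. */
static int
sketch_get_affinity(cpu_set_t *set)
{
	if (set == NULL || !sketch_cpuset_valid)
		return -1;

	*set = sketch_cpuset;
	return 0;
}

/* Re-apply the current mask, which is expected to be accepted. */
void
sketch_affinity_selfcheck(void)
{
	cpu_set_t cur, got, empty;

	assert(sched_getaffinity(0, sizeof(cur), &cur) == 0);

	/* an empty mask is rejected, so the cache stays untouched */
	CPU_ZERO(&empty);
	assert(sketch_set_affinity(&empty) == -1);
	assert(sketch_get_affinity(&got) == -1);

	assert(sketch_set_affinity(&cur) == 0);
	assert(sketch_get_affinity(&got) == 0);
	assert(CPU_EQUAL(&cur, &got));
}
```

Keeping the cached copy in step with the last successful call is what lets
the getter avoid a system call on every query.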

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile        |   1 +
 lib/librte_eal/common/eal_common_thread.c | 142 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile      |   2 +
 3 files changed, 145 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index d434882..78406be 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -73,6 +73,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
new file mode 100644
index 0000000..d996690
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -0,0 +1,142 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <sched.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memcpy.h>
+
+#include "eal_thread.h"
+
+int
+rte_thread_set_affinity(rte_cpuset_t *cpusetp)
+{
+	int s;
+	unsigned lcore_id;
+	pthread_t tid;
+
+	if (!cpusetp)
+		return -1;
+
+	lcore_id = rte_lcore_id();
+	if (lcore_id != (unsigned)LCORE_ID_ANY) {
+		/* EAL thread */
+		tid = lcore_config[lcore_id].thread_id;
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* update lcore_config */
+		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
+		rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
+			   sizeof(rte_cpuset_t));
+	} else {
+		/* none EAL thread */
+		tid = pthread_self();
+
+		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+		if (s != 0) {
+			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+			return -1;
+		}
+
+		/* store cpuset in TLS for quick access */
+		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
+			   sizeof(rte_cpuset_t));
+
+		/* store socket_id in TLS for quick access */
+		RTE_PER_LCORE(_socket_id) =
+			eal_cpuset_socket_id(cpusetp);
+	}
+
+	return 0;
+}
+
+int
+rte_thread_get_affinity(rte_cpuset_t *cpusetp)
+{
+	if (!cpusetp)
+		return -1;
+
+	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
+		   sizeof(rte_cpuset_t));
+
+	return 0;
+}
+
+void
+eal_thread_dump_affinity(char str[], unsigned size)
+{
+	rte_cpuset_t cpuset;
+	unsigned cpu;
+	int ret;
+	unsigned int out = 0;
+
+	if (rte_thread_get_affinity(&cpuset) < 0) {
+		str[0] = '\0';
+		return;
+	}
+
+	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
+		if (!CPU_ISSET(cpu, &cpuset))
+			continue;
+
+		ret = snprintf(str + out,
+			       size - out, "%u,", cpu);
+		if (ret < 0 || (unsigned)ret >= size - out)
+			break;
+
+		out += ret;
+	}
+
+	/* remove the last separator */
+	if (out > 0)
+		str[out - 1] = '\0';
+}
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 025d836..07e21ca 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -85,6 +85,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
@@ -96,6 +97,7 @@ CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
+CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (5 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
                           ` (11 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

rte_gettid() wraps the Linux and FreeBSD system calls that return the thread id.
It provides a persistent, unique thread id for the calling thread.
The id is saved in TLS on the first call.
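
The caching pattern described above might look like the following on Linux.
This is only a sketch: sketch_gettid is a made-up name, the FreeBSD
thr_self() path is left out, and __thread is assumed to be available
(GCC/Clang).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fetch the kernel thread id once per thread and reuse it afterwards. */
static int
sketch_gettid(void)
{
	static __thread int cached_tid = -1;

	if (cached_tid == -1)
		cached_tid = (int)syscall(SYS_gettid);

	return cached_tid;
}

/* The id is positive and stable across calls in the same thread. */
void
sketch_gettid_selfcheck(void)
{
	int first = sketch_gettid();

	assert(first > 0);
	assert(sketch_gettid() == first);
}
```

Only the first call in each thread pays for the system call; later calls
are a TLS read.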

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   |  9 +++++++++
 lib/librte_eal/common/include/rte_eal.h  | 27 +++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_thread.c |  7 +++++++
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 10220c7..d0c077b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <sched.h>
 #include <pthread_np.h>
 #include <sys/queue.h>
+#include <sys/thr.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* require calling thread tid by gettid() */
+int rte_sys_gettid(void)
+{
+	long lwpid;
+	thr_self(&lwpid);
+	return (int)lwpid;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..8ccdd65 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -41,6 +41,9 @@
  */
 
 #include <stdint.h>
+#include <sched.h>
+
+#include <rte_per_lcore.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t usage_func );
  */
 int rte_eal_has_hugepages(void);
 
+/**
+ * A wrap API for syscall gettid.
+ *
+ * @return
+ *   On success, returns the thread ID of calling process.
+ *   It always successful.
+ */
+int rte_sys_gettid(void);
+
+/**
+ * Get system unique thread id.
+ *
+ * @return
+ *   On success, returns the thread ID of calling process.
+ *   It always successful.
+ */
+static inline int rte_gettid(void)
+{
+	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
+	if (RTE_PER_LCORE(_thread_id) == -1)
+		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
+	return RTE_PER_LCORE(_thread_id);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 748a83a..ed20c93 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <pthread.h>
 #include <sched.h>
 #include <sys/queue.h>
+#include <sys/syscall.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* require calling thread tid by gettid() */
+int rte_sys_gettid(void)
+{
+	return (int)syscall(SYS_gettid);
+}
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (6 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain Cunming Liang
                           ` (10 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

EAL threads use the assigned cpuset to set core affinity during startup.
The 1:1 mapping is kept if no '--lcores' option is used.
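
One way to picture the default is the sketch below: an lcore without an
explicit assignment gets a cpuset holding only the cpu with the same index.
How lcore_config is actually populated is handled by earlier patches in the
series, so sketch_lcore_cpuset and its parameters are assumptions made for
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Fill set for an lcore: copy the explicit assignment when one is given,
 * otherwise fall back to the 1:1 default {lcore_id}.
 */
static void
sketch_lcore_cpuset(unsigned lcore_id, const cpu_set_t *assigned,
		    cpu_set_t *set)
{
	assert(set != NULL);

	if (assigned != NULL) {
		*set = *assigned;
		return;
	}

	CPU_ZERO(set);
	CPU_SET(lcore_id, set);
}
```

With no assignment, lcore 3 ends up with a one-element set containing cpu 3,
which matches the behaviour before this series.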

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
 lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
 lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
 4 files changed, 54 insertions(+), 96 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..98c5a83 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
 	int i, fctret, ret;
 	pthread_t thread_id;
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
-		rte_config.master_lcore, thread_id);
-
 	eal_check_mem_on_local_socket();
 
 	rte_eal_mcfg_complete();
 
+	eal_thread_init_master(rte_config.master_lcore);
+
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		rte_config.master_lcore, thread_id, cpuset);
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
@@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
 			rte_panic("Cannot create thread\n");
 	}
 
-	eal_thread_init_master(rte_config.master_lcore);
-
 	/*
 	 * Launch a dummy function on all slave lcores, so that master lcore
 	 * knows they are all ready when this function returns.
diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index d0c077b..5b16302 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpuset_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +146,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +158,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%p)\n",
-		lcore_id, thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +168,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%p;cpuset=[%s])\n",
+		lcore_id, thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f99e158..c95adec 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -702,6 +702,7 @@ rte_eal_init(int argc, char **argv)
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
 	struct shared_driver *solib = NULL;
 	const char *logid;
+	char cpuset[CPU_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -802,8 +803,10 @@ rte_eal_init(int argc, char **argv)
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%x)\n",
-		rte_config.master_lcore, (int)thread_id);
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		rte_config.master_lcore, (int)thread_id, cpuset);
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index ed20c93..6eb1525 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -52,6 +52,7 @@
 #include <rte_eal.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
+#include <rte_memcpy.h>
 
 #include "eal_private.h"
 #include "eal_thread.h"
@@ -97,61 +98,34 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
 	return 0;
 }
 
-/* set affinity for current thread */
+/* set affinity for current EAL thread */
 static int
 eal_thread_set_affinity(void)
 {
 	int s;
 	pthread_t thread;
-
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	unsigned lcore_id = rte_lcore_id();
 
 	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
+	s = pthread_setaffinity_np(thread, sizeof(cpu_set_t),
+				   &lcore_config[lcore_id].cpuset);
 	if (s != 0) {
 		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
 		return -1;
 	}
 
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpu_set_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
+	/* acquire system unique id  */
+	rte_gettid();
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
+
+	rte_memcpy(&RTE_PER_LCORE(_cpuset),
+		   &lcore_config[lcore_id].cpuset, sizeof(rte_cpuset_t));
+
+	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
 	return 0;
 }
 
@@ -174,6 +148,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[CPU_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +160,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%x)\n",
-		lcore_id, (int)thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +170,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%x;cpuset=[%s])\n",
+		lcore_id, (int)thread_id, cpuset);
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (7 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
                           ` (9 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

Some macros are already defined by FreeBSD's 'sys/param.h'.
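
The #undef-before-#define pattern used for this is sketched below. It is a
standalone illustration, not the driver header; the include of sys/param.h
is there only to show where a conflicting definition could come from, and
sketch_page_selfcheck is a name invented for the example.

```c
#include <assert.h>
#include <sys/param.h>	/* on FreeBSD this may already define PAGE_SIZE */

/*
 * Drop any earlier definition so the local value below does not trigger
 * a "macro redefined" diagnostic. #undef on an undefined name is a no-op.
 */
#undef PAGE_SIZE
#define PAGE_SIZE               4096
#define PAGE_ROUND_UP(x) \
	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))

/* Rounding behaves the same whichever header was included first. */
void
sketch_page_selfcheck(void)
{
	assert(PAGE_ROUND_UP(1) == 4096);
	assert(PAGE_ROUND_UP(4096) == 4096);
	assert(PAGE_ROUND_UP(4097) == 8192);
}
```

The trade-off is that the local value silently wins over the system one, so
this only makes sense where the driver really wants its own constant.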

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_pmd_enic/enic.h        | 1 +
 lib/librte_pmd_enic/enic_compat.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
index c43417c..189c3b9 100644
--- a/lib/librte_pmd_enic/enic.h
+++ b/lib/librte_pmd_enic/enic.h
@@ -66,6 +66,7 @@
 #define ENIC_CALC_IP_CKSUM      1
 #define ENIC_CALC_TCP_UDP_CKSUM 2
 #define ENIC_MAX_MTU            9000
+#undef PAGE_SIZE
 #define PAGE_SIZE               4096
 #define PAGE_ROUND_UP(x) \
 	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
index b1af838..b84c766 100644
--- a/lib/librte_pmd_enic/enic_compat.h
+++ b/lib/librte_pmd_enic/enic_compat.h
@@ -67,6 +67,7 @@
 #define pr_warn(y, args...) dev_warning(0, y, ##args)
 #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
 
+#undef ALIGN
 #define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
 #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
 #define udelay usleep
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (8 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:00           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread Cunming Liang
                           ` (8 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

Add a check on the value returned by rte_socket_id() so that an unexpected value such as (-1) is not passed on.
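
A minimal sketch of the fallback, assuming the socket id is passed in as a
parameter instead of being read from rte_socket_id(). SKETCH_SOCKET_ID_ANY
and sketch_malloc_numa_socket are names made up for the example.

```c
#include <assert.h>

#define SKETCH_SOCKET_ID_ANY (-1)	/* stands in for SOCKET_ID_ANY */

/* Map "no particular socket" to socket 0; pass real ids through. */
static unsigned
sketch_malloc_numa_socket(unsigned socket_id)
{
	if (socket_id == (unsigned)SKETCH_SOCKET_ID_ANY)
		return 0;

	return socket_id;
}

/* A thread without a socket falls back to 0, others are unchanged. */
void
sketch_malloc_selfcheck(void)
{
	assert(sketch_malloc_numa_socket((unsigned)SKETCH_SOCKET_ID_ANY) == 0);
	assert(sketch_malloc_numa_socket(1) == 1);
}
```

The cast matters because the id is carried in an unsigned variable, so (-1)
shows up as the largest unsigned value rather than as a negative number.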

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_malloc/malloc_heap.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-	return rte_socket_id();
+	unsigned socket_id = rte_socket_id();
+
+	if (socket_id == (unsigned)SOCKET_ID_ANY)
+		return 0;
+
+	return socket_id;
 }
 
 void *
-- 
1.8.1.4

* [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (9 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:01           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
                           ` (7 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

For non-EAL threads, *_lcore_id* is invalid and is most likely not less than RTE_MAX_LCORE.
The patch adds that check and allows only EAL threads to use the per-thread log level and log type.
All other threads share the global log level and log type.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
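Note (illustration only, not part of the patch): a standalone sketch of the
bounds check with a global fallback. MAX_LCORE, global_level and
per_lcore_level are hypothetical stand-ins for RTE_MAX_LCORE, the global log
level and the per-lcore log_cur_msg[] array.

```c
#define MAX_LCORE 4        /* hypothetical, small stand-in for RTE_MAX_LCORE */
#define LCORE_ID_ANY -1    /* id seen by a non-EAL thread */

static int global_level = 8;            /* hypothetical global log level */
static int per_lcore_level[MAX_LCORE];  /* hypothetical per-lcore state */

/* Return the per-lcore level for EAL threads, the global one otherwise. */
static int cur_msg_loglevel(unsigned lcore_id)
{
	if (lcore_id >= MAX_LCORE)
		return global_level;
	return per_lcore_level[lcore_id];
}
```

Calling cur_msg_loglevel((unsigned)LCORE_ID_ANY) takes the fallback branch
because (unsigned)-1 is larger than any valid index.
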
 lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
 lib/librte_eal/common/include/rte_log.h |  5 +++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
 		rte_logs.type &= (~type);
 }
 
+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+	return rte_logs.type;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_level();
 	return log_cur_msg[lcore_id].loglevel;
 }
 
@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_type();
 	return log_cur_msg[lcore_id].logtype;
 }
 
@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
 
 	/* save loglevel and logtype in a global per-lcore variable */
 	lcore_id = rte_lcore_id();
-	log_cur_msg[lcore_id].loglevel = level;
-	log_cur_msg[lcore_id].logtype = logtype;
+	if (lcore_id < RTE_MAX_LCORE) {
+		log_cur_msg[lcore_id].loglevel = level;
+		log_cur_msg[lcore_id].logtype = logtype;
+	}
 
 	ret = vfprintf(f, format, ap);
 	fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);
 
 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (10 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:01           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 13/17] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
                           ` (6 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

For non-EAL threads, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an index need to handle that case.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its affinity
with rte_thread_set_affinity().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 5b16302..2b3c9a8 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 6eb1525..ab94e20 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -57,8 +57,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 13/17] eal: fix recursive spinlock in non-EAL thraed
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (11 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread Cunming Liang
                           ` (5 subsequent siblings)
  18 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

In a non-EAL thread, lcore_id is always LCORE_ID_ANY,
so it cannot be used as a unique id for the recursive spinlock.
Use rte_gettid() instead.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
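Note (illustration only, not part of the patch): a simplified, non-atomic,
single-threaded model of the ownership test, to show why LCORE_ID_ANY is
unsafe as an owner id. struct rec_lock, rec_init() and rec_trylock() are
hypothetical names, and the assumption that a free lock stores -1 in 'user'
is mine, not something stated in this patch.

```c
#define LCORE_ID_ANY -1

/* Simplified model of a recursive lock; no atomics, illustration only. */
struct rec_lock {
	int locked;  /* stands in for the inner spinlock */
	int user;    /* id of the current owner, -1 when free (assumption) */
	int count;   /* recursion depth */
};

static void rec_init(struct rec_lock *l)
{
	l->locked = 0;
	l->user = -1;
	l->count = 0;
}

/* Returns 1 if the caller now holds the lock, 0 if it is busy. */
static int rec_trylock(struct rec_lock *l, int id)
{
	if (l->user != id) {
		if (l->locked)
			return 0;
		l->locked = 1;
		l->user = id;
	}
	l->count++;
	return 1;
}
```

In this model a caller whose id is LCORE_ID_ANY matches the 'free' owner value
and enters without ever taking the inner lock, whereas distinct thread ids
exclude each other as expected.
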
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (12 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 13/17] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-08 20:01           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 15/17] ring: " Cunming Liang
                           ` (4 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

For a non-EAL thread, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used in either an EAL thread or any user pthread.
A non-EAL thread relies directly on rte_ring, which is not preemption-safe.
Running multiple pthreads per cpu that compete for the same rte_mempool is therefore not recommended.
Performance will be poor, and there is a critical risk if the scheduling policy is RT.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..4845f27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
-		unsigned __lcore_id = rte_lcore_id();		\
-		mp->stats[__lcore_id].name##_objs += n;		\
-		mp->stats[__lcore_id].name##_bulk += 1;		\
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			mp->stats[__lcore_id].name##_objs += n;	\
+			mp->stats[__lcore_id].name##_bulk += 1;	\
+		}                                               \
 	} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 	__MEMPOOL_STAT_ADD(mp, put, n);
 
 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-	/* cache is not enabled or single producer */
-	if (unlikely(cache_size == 0 || is_mp == 0))
+	/* cache is not enabled or single producer or none EAL thread */
+	if (unlikely(cache_size == 0 || is_mp == 0 ||
+		     lcore_id >= RTE_MAX_LCORE))
 		goto ring_enqueue;
 
 	/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
 	uint32_t cache_size = mp->cache_size;
 
 	/* cache is not enabled or single consumer */
-	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+	if (unlikely(cache_size == 0 || is_mc == 0 ||
+		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
 		goto ring_dequeue;
 
 	cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 15/17] ring: add support to non-EAL thread
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (13 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever Cunming Liang
                           ` (3 subsequent siblings)
  18 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

The ring debug statistics are not collected for non-EAL threads.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_ring/rte_ring.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..39bacdd 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {		\
-		unsigned __lcore_id = rte_lcore_id();	\
-		r->stats[__lcore_id].name##_objs += n;	\
-		r->stats[__lcore_id].name##_bulk += 1;	\
+#define __RING_STAT_ADD(r, name, n) do {                        \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id < RTE_MAX_LCORE) {               \
+			r->stats[__lcore_id].name##_objs += n;  \
+			r->stats[__lcore_id].name##_bulk += 1;  \
+		}                                               \
 	} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (14 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 15/17] ring: " Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-06 15:19           ` Olivier MATZ
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread Cunming Liang
                           ` (2 subsequent siblings)
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

Add a sched_yield() syscall if the thread spins for too long while waiting for another thread to finish its operations on the ring.
That gives a pre-empted thread a chance to proceed and finish its ring enqueue/dequeue operation.
The purpose is to reduce contention on the ring.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_ring/rte_ring.h | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 39bacdd..c402c73 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -126,6 +126,7 @@ struct rte_ring_debug_stats {
 
 #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
 #define RTE_RING_MZ_PREFIX "RG_"
+#define RTE_RING_PAUSE_REP 0x100  /**< yield after num of times pause. */
 
 /**
  * An RTE ring structure.
@@ -410,7 +411,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep;
 	uint32_t mask = r->prod.mask;
 	int ret;
 
@@ -468,8 +469,19 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head))
-		rte_pause();
+	do {
+		/* avoid spin too long waiting for other thread finish */
+		for (rep = RTE_RING_PAUSE_REP;
+		     rep != 0 && r->prod.tail != prod_head; rep--)
+			rte_pause();
+
+		/*
+		 * It gives pre-empted thread a chance to proceed and
+		 * finish with ring enqnue operation.
+		 */
+		if (rep == 0)
+			sched_yield();
+	} while (rep == 0);
 
 	r->prod.tail = prod_next;
 	return ret;
@@ -589,7 +601,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep;
 	uint32_t mask = r->prod.mask;
 
 	/* move cons.head atomically */
@@ -634,8 +646,19 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head))
-		rte_pause();
+	do {
+		/* avoid spin too long waiting for other thread finish */
+		for (rep = RTE_RING_PAUSE_REP;
+		     rep != 0 && r->cons.tail != cons_head; rep--)
+			rte_pause();
+
+		/*
+		 * It gives pre-empted thread a chance to proceed and
+		 * finish with ring denqnue operation.
+		 */
+		if (rep == 0)
+			sched_yield();
+	} while (rep == 0);
 
 	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (15 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever Cunming Liang
@ 2015-02-02  2:02         ` Cunming Liang
  2015-02-10 17:45           ` Olivier MATZ
  2015-02-06 15:47         ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Olivier MATZ
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
  18 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-02  2:02 UTC (permalink / raw)
  To: dev

Allow timers to be set up only for EAL (lcore) threads (__lcore_id < RTE_MAX_LCORE).
E.g. a dynamically created thread will be able to reset/stop a timer for an lcore thread,
but it will not be allowed to set up a timer for itself or for another non-lcore thread.
rte_timer_manage() called from a non-lcore thread simply does nothing and returns straight away.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++---------
 lib/librte_timer/rte_timer.h |  2 +-
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..601c159 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
 
 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do {				\
-		unsigned __lcore_id = rte_lcore_id();		\
-		priv_timer[__lcore_id].stats.name += (n);	\
+#define __TIMER_STAT_ADD(name, n) do {					\
+		unsigned __lcore_id = rte_lcore_id();			\
+		if (__lcore_id < RTE_MAX_LCORE)				\
+			priv_timer[__lcore_id].stats.name += (n);	\
 	} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
 	unsigned lcore_id;
 
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		lcore_id = LCORE_ID_ANY;
 
 	/* wait that the timer is in correct status before update,
 	 * and mark it as being configured */
 	while (success == 0) {
 		prev_status.u32 = tim->status.u32;
 
+		/*
+		 * prevent race condition of non-EAL threads
+		 * to update the timer. When 'owner == LCORE_ID_ANY',
+		 * it means updated by a non-EAL thread.
+		 */
+		if (lcore_id == (unsigned)LCORE_ID_ANY &&
+		    (uint16_t)lcore_id == prev_status.owner)
+			return -1;
+
 		/* timer is running on another core, exit */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    (unsigned)prev_status.owner != lcore_id)
+		    prev_status.owner != (uint16_t)lcore_id)
 			return -1;
 
 		/* timer is being configured on another core */
@@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 
 	/* round robin for tim_lcore */
 	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-					       0, 1);
-		priv_timer[lcore_id].prev_lcore = tim_lcore;
+		if (lcore_id < RTE_MAX_LCORE) {
+			tim_lcore = rte_get_next_lcore(
+				priv_timer[lcore_id].prev_lcore,
+				0, 1);
+			priv_timer[lcore_id].prev_lcore = tim_lcore;
+		} else
+			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
 	}
 
 	/* wait that the timer is in correct status before update,
@@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 		return -1;
 
 	__TIMER_STAT_ADD(reset, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
 		return -1;
 
 	__TIMER_STAT_ADD(stop, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -499,6 +517,10 @@ void rte_timer_manage(void)
 	uint64_t cur_time;
 	int i, ret;
 
+	/* timer manager only runs on EAL thread */
+	if (lcore_id >= RTE_MAX_LCORE)
+		return;
+
 	__TIMER_STAT_ADD(manage, 1);
 	/* optimize for the case where per-cpu list is empty */
 	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..5c5df91 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
 
-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
 
 /**
  * Timer type: Periodic or single (one-shot).
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever Cunming Liang
@ 2015-02-06 15:19           ` Olivier MATZ
  2015-02-09 15:43             ` Ananyev, Konstantin
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-06 15:19 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> Add a sched_yield() syscall if the thread spins for too long, waiting other thread to finish its operations on the ring.
> That gives pre-empted thread a chance to proceed and finish with ring enqnue/dequeue operation.
> The purpose is to reduce contention on the ring.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_ring/rte_ring.h | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 39bacdd..c402c73 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -126,6 +126,7 @@ struct rte_ring_debug_stats {
>  
>  #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
>  #define RTE_RING_MZ_PREFIX "RG_"
> +#define RTE_RING_PAUSE_REP 0x100  /**< yield after num of times pause. */
>  
>  /**
>   * An RTE ring structure.
> @@ -410,7 +411,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
>  	uint32_t cons_tail, free_entries;
>  	const unsigned max = n;
>  	int success;
> -	unsigned i;
> +	unsigned i, rep;
>  	uint32_t mask = r->prod.mask;
>  	int ret;
>  
> @@ -468,8 +469,19 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
>  	 * If there are other enqueues in progress that preceded us,
>  	 * we need to wait for them to complete
>  	 */
> -	while (unlikely(r->prod.tail != prod_head))
> -		rte_pause();
> +	do {
> +		/* avoid spin too long waiting for other thread finish */
> +		for (rep = RTE_RING_PAUSE_REP;
> +		     rep != 0 && r->prod.tail != prod_head; rep--)
> +			rte_pause();
> +
> +		/*
> +		 * It gives pre-empted thread a chance to proceed and
> +		 * finish with ring enqnue operation.
> +		 */
> +		if (rep == 0)
> +			sched_yield();
> +	} while (rep == 0);
>  
>  	r->prod.tail = prod_next;
>  	return ret;
> @@ -589,7 +601,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
>  	uint32_t cons_next, entries;
>  	const unsigned max = n;
>  	int success;
> -	unsigned i;
> +	unsigned i, rep;
>  	uint32_t mask = r->prod.mask;
>  
>  	/* move cons.head atomically */
> @@ -634,8 +646,19 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
>  	 * If there are other dequeues in progress that preceded us,
>  	 * we need to wait for them to complete
>  	 */
> -	while (unlikely(r->cons.tail != cons_head))
> -		rte_pause();
> +	do {
> +		/* avoid spin too long waiting for other thread finish */
> +		for (rep = RTE_RING_PAUSE_REP;
> +		     rep != 0 && r->cons.tail != cons_head; rep--)
> +			rte_pause();
> +
> +		/*
> +		 * It gives pre-empted thread a chance to proceed and
> +		 * finish with ring denqnue operation.
> +		 */
> +		if (rep == 0)
> +			sched_yield();
> +	} while (rep == 0);
>  
>  	__RING_STAT_ADD(r, deq_success, n);
>  	r->cons.tail = cons_next;
> 

The ring library was designed with the assumption that the code is not
preemptable. The code is lock-less but not wait-less. Actually, if the
code is preempted at a bad moment, it can spin forever until it's
unscheduled.

I wonder if adding a sched_yield() may not penalize the current
implementations that only use one pthread per core? Even if there
is only one pthread in the scheduler queue for this CPU, calling
the scheduler code may cost thousands of cycles.

Also, where does the value "RTE_RING_PAUSE_REP 0x100" come from?
Why is 0x100 better than 42 or 10000?

I think it could be good to check if there is a performance impact
with this change, especially where there is a lot of contention on
the ring. If it has an impact, what about adding a compile or runtime
option?
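
For reference, the wait loop under discussion can be reduced to the
following standalone sketch. It is only an approximation: wait_for_tail()
and PAUSE_REP are hypothetical names, C11 atomics replace the ring's own
reads, and the place where rte_pause() would be called is left as a comment.
Returning the number of yields is one possible way to measure how often the
slow path is taken on a contended ring.

```c
#include <sched.h>
#include <stdatomic.h>

#define PAUSE_REP 0x100  /* spins before yielding; value taken from the patch */

/*
 * Wait until *tail equals expected. Spin up to PAUSE_REP times, then
 * call sched_yield() and start over. Returns the number of yields.
 */
static unsigned wait_for_tail(atomic_uint *tail, unsigned expected)
{
	unsigned yields = 0;
	unsigned rep;

	for (;;) {
		for (rep = PAUSE_REP;
		     rep != 0 &&
		     atomic_load_explicit(tail, memory_order_acquire) != expected;
		     rep--)
			; /* rte_pause() would go here */

		if (rep != 0)
			return yields;  /* tail caught up within the spin budget */

		sched_yield();          /* let a pre-empted thread run */
		yields++;
	}
}
```

When the tail already matches, the function returns 0 without entering the
scheduler, which is the case a one-pthread-per-core setup would normally hit.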


Regards,
Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (16 preceding siblings ...)
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread Cunming Liang
@ 2015-02-06 15:47         ` Olivier MATZ
  2015-02-06 19:24           ` Robert Sanford
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
  18 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-06 15:47 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> v4 changes:
>   new patch fixing strnlen() invalid return in 32bit icc [03/17]
>   update and add more comments on sched_yield() [16/17]
> 
> v3 changes:
>   new patch adding sched_yield() in rte_ring to avoid long spin [16/17]
> 
> v2 changes:
>   add '<number>-<number>' support for EAL option '--lcores' [02/17]
> 
> The patch series contain the enhancements of EAL and fixes for libraries
> to run multi-pthreads(either EAL or non-EAL thread) per physical core.
> Two major changes list as below:
> - Extend the core affinity of each EAL thread to 1:n.
>   Each lcore stands for a EAL thread rather than a logical core.
>   The change adds new EAL option to allow static lcore to cpuset assginment.
>   Then a lcore(EAL thread) affinity to a cpuset, original 1:1 mapping is the special case.
> - Fix the libraries to allow running on any non-EAL thread.
>   It fix the gaps running libraries in non-EAL thread(dynamic created by user).
>   Each fix libraries take care the case of rte_lcore_id() >= RTE_MAX_LCORE.

Sorry if I missed something, but after reading the mailing list threads
about this subject, I cannot find an explanation of what problem
this series tries to solve.

Can you give some details about which use cases require multiple
pthreads per cpu? What are the advantages of doing so?

Regards,
Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core
  2015-02-06 15:47         ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Olivier MATZ
@ 2015-02-06 19:24           ` Robert Sanford
  2015-02-06 19:59             ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Robert Sanford @ 2015-02-06 19:24 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

On Fri, Feb 6, 2015 at 10:47 AM, Olivier MATZ wrote:

> Hi,
>
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > v4 changes:
> >   new patch fixing strnlen() invalid return in 32bit icc [03/17]
> >   update and add more comments on sched_yield() [16/17]
> >
> > v3 changes:
> >   new patch adding sched_yield() in rte_ring to avoid long spin [16/17]
> >
> > v2 changes:
> >   add '<number>-<number>' support for EAL option '--lcores' [02/17]
> >
> > The patch series contain the enhancements of EAL and fixes for libraries
> > to run multi-pthreads(either EAL or non-EAL thread) per physical core.
> > Two major changes list as below:
> > - Extend the core affinity of each EAL thread to 1:n.
> >   Each lcore stands for a EAL thread rather than a logical core.
> >   The change adds new EAL option to allow static lcore to cpuset
> assginment.
> >   Then a lcore(EAL thread) affinity to a cpuset, original 1:1 mapping is
> the special case.
> > - Fix the libraries to allow running on any non-EAL thread.
> >   It fix the gaps running libraries in non-EAL thread(dynamic created by
> user).
> >   Each fix libraries take care the case of rte_lcore_id() >=
> RTE_MAX_LCORE.
>
> Sorry if I missed something, but after reading the mailing list threads
> about this subject, I cannot find an explanation about what problem
> this series try to solve.
>
> Can you give some details about which use-case require to have multiple
> pthreads per cpu? What are the advantage of doing so?
>
> Regards,
> Olivier
>


http://dpdk.org/ml/archives/dev/2014-December/009838.html

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core
  2015-02-06 19:24           ` Robert Sanford
@ 2015-02-06 19:59             ` Olivier MATZ
  0 siblings, 0 replies; 253+ messages in thread
From: Olivier MATZ @ 2015-02-06 19:59 UTC (permalink / raw)
  To: Robert Sanford; +Cc: dev


On 02/06/2015 08:24 PM, Robert Sanford wrote:
>     Sorry if I missed something, but after reading the mailing list threads
>     about this subject, I cannot find an explanation about what problem
>     this series try to solve.
> 
>     Can you give some details about which use-case require to have multiple
>     pthreads per cpu? What are the advantage of doing so?
> 
>     Regards,
>     Olivier
> 
> http://dpdk.org/ml/archives/dev/2014-December/009838.html

Thanks, indeed I missed it.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config Cunming Liang
@ 2015-02-08 19:59           ` Olivier MATZ
  2015-02-09 11:33             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 19:59 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
> as the lcore no longer always 1:1 pinning with physical cpu.
> The lcore now stands for a EAL thread rather than a logical cpu.
> 
> It doesn't change the default behavior of 1:1 mapping, but allows to
> affinity the EAL thread to multiple cpus.
> 
> [...]
> diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
> index 65ee87d..a34d500 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_memory.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
> @@ -45,6 +45,8 @@
>  #include "eal_internal_cfg.h"
>  #include "eal_filesystem.h"
>  
> +/* avoid re-defined against with freebsd header */
> +#undef PAGE_SIZE
>  #define PAGE_SIZE (sysconf(_SC_PAGESIZE))

I don't see the link with the patch. Should this go somewhere else?


>  
>  /*
> diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
> index 49b2c03..4c7d6bb 100644
> --- a/lib/librte_eal/common/include/rte_lcore.h
> +++ b/lib/librte_eal/common/include/rte_lcore.h
> @@ -50,6 +50,13 @@ extern "C" {
>  
>  #define LCORE_ID_ANY -1    /**< Any lcore. */
>  
> +#if defined(__linux__)
> +	typedef	cpu_set_t rte_cpuset_t;
> +#elif defined(__FreeBSD__)
> +#include <pthread_np.h>
> +	typedef cpuset_t rte_cpuset_t;
> +#endif
> +

Should we also define RTE_CPU_SETSIZE?
For linux, should <sched.h> be included?

If I understand correctly, after the patch series, the users of
rte_thread_set_affinity() and rte_thread_get_affinity() are
supposed to use the macros from sched.h to access this
cpuset parameter. So I'm wondering if it would be better to
use cpu_set_t from libc instead of redefining rte_cpuset_t.

To reword my question: what is the purpose of redefining
cpu_set_t as rte_cpuset_t if we still need to use the whole
libc API to access it?


Regards,
Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-02-08 19:59           ` Olivier MATZ
  2015-02-09 11:45             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 19:59 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> It supports one new eal long option '--lcores' for EAL thread cpuset assignment.
> 
> The format pattern:
> 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> lcores, cpus could be a single digit/range or a group.
> '(' and ')' are necessary if it's a group.
> If not supply '@cpus', the value of cpus uses the same as lcores.
> 
> e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL thread as below
>   lcore 0 runs on cpuset 0x41 (cpu 0,6)
>   lcore 1 runs on cpuset 0x2 (cpu 1)
>   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
>   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
>   lcore 6 runs on cpuset 0x41 (cpu 0,6)
>   lcore 7 runs on cpuset 0x80 (cpu 7)
>   lcore 8 runs on cpuset 0x100 (cpu 8)
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/common/eal_common_launch.c  |   1 -
>  lib/librte_eal/common/eal_common_options.c | 300 ++++++++++++++++++++++++++++-
>  lib/librte_eal/common/eal_options.h        |   2 +
>  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
>  4 files changed, 299 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_launch.c b/lib/librte_eal/common/eal_common_launch.c
> index 599f83b..2d732b1 100644
> --- a/lib/librte_eal/common/eal_common_launch.c
> +++ b/lib/librte_eal/common/eal_common_launch.c
> @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
>  		rte_eal_wait_lcore(lcore_id);
>  	}
>  }
> -


This line should be removed from the patch.


> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> index 67e02dc..29ebb6f 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -45,6 +45,7 @@
>  #include <rte_lcore.h>
>  #include <rte_version.h>
>  #include <rte_devargs.h>
> +#include <rte_memcpy.h>
>  
>  #include "eal_internal_cfg.h"
>  #include "eal_options.h"
> @@ -85,6 +86,7 @@ eal_long_options[] = {
>  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
>  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
>  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
>  	{0, 0, 0, 0}
>  };
>  
> @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
>  			if (min == RTE_MAX_LCORE)
>  				min = idx;
>  			for (idx = min; idx <= max; idx++) {
> -				cfg->lcore_role[idx] = ROLE_RTE;
> -				lcore_config[idx].core_index = count;
> -				count++;
> +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> +					cfg->lcore_role[idx] = ROLE_RTE;
> +					lcore_config[idx].core_index = count;
> +					count++;
> +				}
>  			}
>  			min = RTE_MAX_LCORE;
>  		} else
> @@ -292,6 +296,279 @@ eal_parse_master_lcore(const char *arg)
>  	return 0;
>  }
>  
> +/*
> + * Parse elem, the elem could be single number/range or '(' ')' group
> + * Within group elem, '-' used for a range seperator;
> + *                    ',' used for a single number.
> + */
> +static int
> +eal_parse_set(const char *input, uint16_t set[], unsigned num)

It's not very clear what "elem" is. Maybe the comment could be reworded.
What about naming the function "eal_parse_cpuset()" instead?


> +{
> +	unsigned idx;
> +	const char *str = input;
> +	char *end = NULL;
> +	unsigned min, max;
> +
> +	memset(set, 0, num * sizeof(uint16_t));
> +
> +	while (isblank(*str))
> +		str++;
> +
> +	/* only digit or left bracket is qulify for start point */
> +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> +		return -1;
> +
> +	/* process single number or single range of number */
> +	if (*str != '(') {
> +		errno = 0;
> +		idx = strtoul(str, &end, 10);
> +		if (errno || end == NULL || idx >= num)
> +			return -1;
> +		else {
> +			while (isblank(*end))
> +				end++;
> +
> +			min = idx;
> +			max = idx;
> +			if (*end == '-') {
> +				/* proccess single <number>-<number> */
> +				end++;
> +				while (isblank(*end))
> +					end++;
> +				if (!isdigit(*end))
> +					return -1;
> +
> +				errno = 0;
> +				idx = strtoul(end, &end, 10);
> +				if (errno || end == NULL || idx >= num)
> +					return -1;
> +				max = idx;
> +				while (isblank(*end))
> +					end++;
> +				if (*end != ',' && *end != '\0')
> +					return -1;
> +			}
> +
> +			if (*end != ',' && *end != '\0' &&
> +			    *end != '@')
> +				return -1;
> +
> +			for (idx = RTE_MIN(min, max);
> +			     idx <= RTE_MAX(min, max); idx++)
> +				set[idx] = 1;
> +
> +			return end - input;
> +		}
> +	}
> +
> +	/* process set within bracket */
> +	str++;
> +	while (isblank(*str))
> +		str++;
> +	if (*str == '\0')
> +		return -1;
> +
> +	min = RTE_MAX_LCORE;
> +	do {
> +
> +		/* go ahead to the first digit */
> +		while (isblank(*str))
> +			str++;
> +		if (!isdigit(*str))
> +			return -1;
> +
> +		/* get the digit value */
> +		errno = 0;
> +		idx = strtoul(str, &end, 10);
> +		if (errno || end == NULL || idx >= num)
> +			return -1;
> +
> +		/* go ahead to separator '-',',' and ')' */
> +		while (isblank(*end))
> +			end++;
> +		if (*end == '-') {
> +			if (min == RTE_MAX_LCORE)
> +				min = idx;
> +			else /* avoid continuous '-' */
> +				return -1;
> +		} else if ((*end == ',') || (*end == ')')) {
> +			max = idx;
> +			if (min == RTE_MAX_LCORE)
> +				min = idx;
> +			for (idx = RTE_MIN(min, max);
> +			     idx <= RTE_MAX(min, max); idx++)
> +				set[idx] = 1;
> +
> +			min = RTE_MAX_LCORE;
> +		} else
> +			return -1;
> +
> +		str = end + 1;
> +	} while (*end != '\0' && *end != ')');
> +
> +	return str - input;
> +}

In the function above, there are some typos in the comments
 seperator -> separator
 qulify -> qualify
 proccess -> process

> +
> +/* convert from set array to cpuset bitmap */
> +static inline int
> +convert_to_cpuset(rte_cpuset_t *cpusetp,
> +	      uint16_t *set, unsigned num)

I don't think the function should be inlined



Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc Cunming Liang
@ 2015-02-08 19:59           ` Olivier MATZ
  2015-02-09 11:57             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 19:59 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The problem is that strnlen() here may return invalid value with 32bit icc.
> (actually it returns its second parameter, e.g. sysconf(_SC_ARG_MAX)).
> It starts to manifest when max_len parameter is > 2M and using icc -m32 -O2 (or above).
> 
> Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/common/eal_common_options.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> index 29ebb6f..22d5d37 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -227,7 +227,7 @@ eal_parse_corelist(const char *corelist)
>  	/* Remove all blank characters ahead and after */
>  	while (isblank(*corelist))
>  		corelist++;
> -	i = strnlen(corelist, sysconf(_SC_ARG_MAX));
> +	i = strnlen(corelist, PATH_MAX);
>  	while ((i > 0) && isblank(corelist[i - 1]))
>  		i--;
>  
> @@ -469,7 +469,7 @@ eal_parse_lcores(const char *lcores)
>  	/* Remove all blank characters ahead and after */
>  	while (isblank(*lcores))
>  		lcores++;
> -	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> +	i = strnlen(lcores, PATH_MAX);
>  	while ((i > 0) && isblank(lcores[i - 1]))
>  		i--;
>  
> 

I think PATH_MAX is not equivalent to _SC_ARG_MAX.

But the main question is: why do we need to use strnlen() here instead
of strlen? We can expect that argv[] pointers are always nul-terminated.
Replacing them by strlen() would probably also solve the icc issue.
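
A minimal sketch of the blank-trimming step written with strlen(); trim_blanks() is a hypothetical name, and the only assumption is that the input is nul-terminated, as argv[] strings are:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Skip leading blanks and return the length of the remaining string
 * without its trailing blanks. *startp is set to the first non-blank
 * character (or to the terminating nul if the string is all blanks).
 */
static size_t
trim_blanks(const char *arg, const char **startp)
{
	size_t len;

	while (isblank((unsigned char)*arg))
		arg++;
	len = strlen(arg);
	while (len > 0 && isblank((unsigned char)arg[len - 1]))
		len--;
	*startp = arg;
	return len;
}
```

This keeps the same loop structure as the quoted code and needs no upper bound such as PATH_MAX or sysconf(_SC_ARG_MAX).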

Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-09 12:26             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> It returns the socket_id if all cpus in the cpuset belongs
> to the same NUMA node, otherwise it will return SOCKET_ID_ANY.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++
>  lib/librte_eal/common/eal_thread.h      | 52 +++++++++++++++++++++++++++++++++
>  lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +++++
>  3 files changed, 66 insertions(+)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
> index 72f8ac2..162fb4f 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
> @@ -41,6 +41,7 @@
>  #include <rte_debug.h>
>  
>  #include "eal_private.h"
> +#include "eal_thread.h"
>  
>  /* No topology information available on FreeBSD including NUMA info */
>  #define cpu_core_id(X) 0
> @@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
>  
>  	return 0;
>  }
> +
> +unsigned
> +eal_cpu_socket_id(__rte_unused unsigned cpu_id)
> +{
> +	return cpu_socket_id(cpu_id);
> +}
> diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
> index b53b84d..a25ee86 100644
> --- a/lib/librte_eal/common/eal_thread.h
> +++ b/lib/librte_eal/common/eal_thread.h
> @@ -34,6 +34,10 @@
>  #ifndef EAL_THREAD_H
>  #define EAL_THREAD_H
>  
> +#include <sched.h>
> +
> +#include <rte_debug.h>
> +
>  /**
>   * basic loop of thread, called for each thread by eal_init().
>   *
> @@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
>   */
>  void eal_thread_init_master(unsigned lcore_id);
>  
> +/**
> + * Get the NUMA socket id from cpu id.
> + * This function is private to EAL.
> + *
> + * @param cpu_id
> + *   The logical process id.
> + * @return
> + *   socket_id or SOCKET_ID_ANY
> + */
> +unsigned eal_cpu_socket_id(unsigned cpu_id);

Wouldn't it be better to rename the existing function cpu_socket_id()
to eal_cpu_socket_id() and export it in eal_thread.h?

On BSD, where cpu_socket_id() is implemented as a #define,
a new function returning 0 should be created.


> +
> +/**
> + * Get the NUMA socket id from cpuset.
> + * This function is private to EAL.
> + *
> + * @param cpusetp
> + *   The point to a valid cpu set.
> + * @return
> + *   socket_id or SOCKET_ID_ANY
> + */
> +static inline int
> +eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
> +{
> +	unsigned cpu = 0;
> +	int socket_id = SOCKET_ID_ANY;
> +	int sid;
> +
> +	if (cpusetp == NULL)
> +		return SOCKET_ID_ANY;

SOCKET_ID_ANY is not defined, maybe <rte_lcore.h> should be included
somewhere.

> +
> +	do {
> +		if (!CPU_ISSET(cpu, cpusetp))
> +			continue;
> +
> +		if (socket_id == SOCKET_ID_ANY)
> +			socket_id = eal_cpu_socket_id(cpu);
> +
> +		sid = eal_cpu_socket_id(cpu);
> +		if (socket_id != sid) {
> +			socket_id = SOCKET_ID_ANY;
> +			break;
> +		}
> +
> +	} while (++cpu < RTE_MAX_LCORE);
> +
> +	return socket_id;
> +}


I don't think this function should be inlined.

As this function is not used, it could be interesting for reviewers
to understand when

> +
>  #endif /* EAL_THREAD_H */
> diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
> index 29615f8..922af6d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
> @@ -45,6 +45,7 @@
>  
>  #include "eal_private.h"
>  #include "eal_filesystem.h"
> +#include "eal_thread.h"
>  
>  #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
>  #define CORE_ID_FILE "topology/core_id"
> @@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
>  
>  	return 0;
>  }
> +
> +unsigned
> +eal_cpu_socket_id(unsigned cpu_id)
> +{
> +	return cpu_socket_id(cpu_id);
> +}
> 

* Re: [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-09 12:45             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> 1. add two TLS *_socket_id* and *_cpuset*
> 2. add two external API rte_thread_set/get_affinity
> 3. add one internal API eal_thread_dump_affinity

To me, it's a bit strange to add an API without the associated code.
Maybe you have a good reason to do that, but in that case I think it
should be explained in the commit log.

> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
>  lib/librte_eal/common/eal_thread.h        | 14 ++++++++++++++
>  lib/librte_eal/common/include/rte_lcore.h | 29 +++++++++++++++++++++++++++--
>  lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
>  4 files changed, 45 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index ab05368..10220c7 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -56,6 +56,8 @@
>  #include "eal_thread.h"
>  
>  RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
>  
>  /*
>   * Send a message to a slave lcore identified by slave_id to call a
> diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
> index a25ee86..28edf51 100644
> --- a/lib/librte_eal/common/eal_thread.h
> +++ b/lib/librte_eal/common/eal_thread.h
> @@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
>  	return socket_id;
>  }
>  
> +/**
> + * Dump the current pthread cpuset.
> + * This function is private to EAL.
> + *
> + * @param str
> + *   The string buffer the cpuset will dump to.
> + * @param size
> + *   The string buffer size.
> + */
> +#define CPU_STR_LEN            256
> +void
> +eal_thread_dump_affinity(char str[], unsigned size);

Although it's equivalent for function arguments, I think "char *str" is
usually preferred over "char str[]". See for instance in snprintf() or
fgets().

What is the purpose of CPU_STR_LEN?

What occurs if the size of the dump is greater than the size of the
given buffer? Is the string truncated? Is there a \0 at the end?
This should be described in the API comments. Maybe adding a return
value could help the user to determine if the string was truncated.
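
As a sketch of the behaviour being asked for (always nul-terminated, with an error return on truncation), assuming Linux with glibc; cpuset_to_str() is a hypothetical name, not the function from the patch:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>
#include <stdio.h>

/*
 * Write the cpus of *set into str as "a,b,c".
 * Returns 0 on success and -1 if the list did not fit. When size > 0,
 * str is always nul-terminated and holds whatever part of the list fit.
 */
static int
cpuset_to_str(const cpu_set_t *set, char *str, size_t size)
{
	size_t out = 0;
	int cpu, ret;

	if (size == 0)
		return -1;
	str[0] = '\0';

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, set))
			continue;
		ret = snprintf(str + out, size - out, "%s%d",
			       out ? "," : "", cpu);
		if (ret < 0 || (size_t)ret >= size - out)
			return -1;	/* output was truncated */
		out += ret;
	}
	return 0;
}
```

With a return value like this, the caller can tell a complete dump from a partial one without inspecting the buffer.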

> +
> +
>  #endif /* EAL_THREAD_H */
> diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
> index 4c7d6bb..facdbdc 100644
> --- a/lib/librte_eal/common/include/rte_lcore.h
> +++ b/lib/librte_eal/common/include/rte_lcore.h
> @@ -43,6 +43,7 @@
>  #include <rte_per_lcore.h>
>  #include <rte_eal.h>
>  #include <rte_launch.h>
> +#include <rte_memory.h>
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -80,7 +81,9 @@ struct lcore_config {
>   */
>  extern struct lcore_config lcore_config[RTE_MAX_LCORE];
>  
> -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
> +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
> +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
> +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
>  
>  /**
>   * Return the ID of the execution unit we are running on.
> @@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
>  static inline unsigned
>  rte_socket_id(void)
>  {
> -	return lcore_config[rte_lcore_id()].socket_id;
> +	return RTE_PER_LCORE(_socket_id);
>  }

I don't see where the _socket_id variable is assigned. I think there
is probably an issue with the splitting of the patches.

Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-09 13:12             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The API works for both EAL thread and none EAL thread.
> When calling rte_thread_set_affinity, the *_socket_id* and
> *_cpuset* of calling thread will be updated if the thread
> successful set the cpu affinity.
> 
> [...]
> +int
> +rte_thread_set_affinity(rte_cpuset_t *cpusetp)
> +{
> +	int s;
> +	unsigned lcore_id;
> +	pthread_t tid;
> +
> +	if (!cpusetp)
> +		return -1;

Is it really necessary to test that cpusetp is not NULL?

> +
> +	lcore_id = rte_lcore_id();
> +	if (lcore_id != (unsigned)LCORE_ID_ANY) {

It is strange to test for something that cannot happen yet:
lcore_id == LCORE_ID_ANY is only possible after your patch 12/17
is applied. Maybe the series can be reordered to avoid this inconsistency?

> +		/* EAL thread */
> +		tid = lcore_config[lcore_id].thread_id;
> +
> +		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
> +		if (s != 0) {
> +			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> +			return -1;
> +		}
> +
> +		/* store socket_id in TLS for quick access */
> +		RTE_PER_LCORE(_socket_id) =
> +			eal_cpuset_socket_id(cpusetp);
> +
> +		/* store cpuset in TLS for quick access */
> +		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
> +			   sizeof(rte_cpuset_t));
> +
> +		/* update lcore_config */
> +		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
> +		rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
> +			   sizeof(rte_cpuset_t));
> +	} else {
> +		/* none EAL thread */
> +		tid = pthread_self();
> +
> +		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
> +		if (s != 0) {
> +			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> +			return -1;
> +		}
> +
> +		/* store cpuset in TLS for quick access */
> +		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
> +			   sizeof(rte_cpuset_t));
> +
> +		/* store socket_id in TLS for quick access */
> +		RTE_PER_LCORE(_socket_id) =
> +			eal_cpuset_socket_id(cpusetp);
> +	}

Why not always use pthread_self() to get the tid?

I think most of the code could be factorized here. The only difference
(which is hard to see because the two branches do not order their
statements the same way) is that the config is updated for an EAL thread.
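
A rough sketch of the factorized shape, assuming Linux with glibc. All names here (MAX_LCORE, cfg_cpuset, tls_cpuset, tls_lcore_id, thread_set_affinity) are hypothetical stand-ins for the DPDK ones, and the socket id update is left out for brevity:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for pthread_setaffinity_np() and cpu_set_t */
#endif
#include <pthread.h>
#include <sched.h>

#define MAX_LCORE 128	/* hypothetical stand-in for RTE_MAX_LCORE */

/* hypothetical stand-ins for lcore_config[] and the per-thread variables */
static cpu_set_t cfg_cpuset[MAX_LCORE];
static __thread cpu_set_t tls_cpuset;
static __thread unsigned tls_lcore_id = (unsigned)-1;	/* non-EAL thread */

static int
thread_set_affinity(const cpu_set_t *cpusetp)
{
	/* common path: the caller is always the thread being pinned */
	if (pthread_setaffinity_np(pthread_self(), sizeof(*cpusetp),
				   cpusetp) != 0)
		return -1;

	/* cache the cpuset in thread-local storage for every thread */
	tls_cpuset = *cpusetp;

	/* only EAL threads own a slot in the global config */
	if (tls_lcore_id < MAX_LCORE)
		cfg_cpuset[tls_lcore_id] = *cpusetp;

	return 0;
}
```

Written this way, the single difference between the EAL and non-EAL cases is visible as one conditional at the end.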



> +
> +	return 0;
> +}
> +
> +int
> +rte_thread_get_affinity(rte_cpuset_t *cpusetp)
> +{
> +	if (!cpusetp)
> +		return -1;

Same here. This is the only reason why rte_thread_get_affinity() could
fail. Removing this test would allow the API to return void
instead. It would also avoid a useless test below in
eal_thread_dump_affinity().

> +
> +	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
> +		   sizeof(rte_cpuset_t));
> +
> +	return 0;
> +}
> +
> +void
> +eal_thread_dump_affinity(char str[], unsigned size)
> +{
> +	rte_cpuset_t cpuset;
> +	unsigned cpu;
> +	int ret;
> +	unsigned int out = 0;
> +
> +	if (rte_thread_get_affinity(&cpuset) < 0) {
> +		str[0] = '\0';
> +		return;
> +	}

This one could be removed if the (== NULL) test is removed.

> +
> +	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
> +		if (!CPU_ISSET(cpu, &cpuset))
> +			continue;
> +
> +		ret = snprintf(str + out,
> +			       size - out, "%u,", cpu);
> +		if (ret < 0 || (unsigned)ret >= size - out)
> +			break;

On the contrary, I think returning an error to the user here
would be useful, so they know that the dump is not complete.


Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-10  6:57             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> The rte_gettid() wraps the linux and freebsd syscall gettid().
> It provides a persistent unique thread id for the calling thread.
> It will save the unique id in TLS on the first time.
> 
> [...]
>
> +/**
> + * A wrap API for syscall gettid.
> + *
> + * @return
> + *   On success, returns the thread ID of calling process.
> + *   It always successful.
> + */
> +int rte_sys_gettid(void);
> +
> +/**
> + * Get system unique thread id.
> + *
> + * @return
> + *   On success, returns the thread ID of calling process.
> + *   It always successful.
> + */
> +static inline int rte_gettid(void)
> +{
> +	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
> +	if (RTE_PER_LCORE(_thread_id) == -1)
> +		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
> +	return RTE_PER_LCORE(_thread_id);
> +}

Instead of doing the test each time rte_gettid() is called, why not
have 2 functions:
  rte_init_tid() -> assign the per_lcore variable
  rte_gettid() -> return the per_lcore variable
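
A minimal sketch of that split, assuming Linux (the gettid syscall is reached through syscall(2)). The names thread_init_tid(), thread_gettid() and tls_tid are hypothetical:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the syscall() declaration */
#endif
#include <sys/syscall.h>
#include <unistd.h>

static __thread int tls_tid = -1;	/* hypothetical per-thread cache */

/* Call once when the thread starts. */
static void
thread_init_tid(void)
{
	tls_tid = (int)syscall(SYS_gettid);
}

/* Hot path: a plain read with no branch. */
static inline int
thread_gettid(void)
{
	return tls_tid;
}
```

The trade-off is that a thread which never calls the init function reads -1, so the init call has to be guaranteed for every thread that may use the getter.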



Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-09 13:48             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> EAL threads use assigned cpuset to set core affinity during startup.
> It keeps 1:1 mapping, if no '--lcores' option is used.
> 
> [...]
>
>  lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
>  lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
>  lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
>  lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
>  4 files changed, 54 insertions(+), 96 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index 69f3c03..98c5a83 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
>  	int i, fctret, ret;
>  	pthread_t thread_id;
>  	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
> +	char cpuset[CPU_STR_LEN];
>  
>  	if (!rte_atomic32_test_and_set(&run_once))
>  		return -1;
> @@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
>  	if (rte_eal_pci_init() < 0)
>  		rte_panic("Cannot init PCI\n");
>  
> -	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
> -		rte_config.master_lcore, thread_id);
> -
>  	eal_check_mem_on_local_socket();
>  
>  	rte_eal_mcfg_complete();
>  
> +	eal_thread_init_master(rte_config.master_lcore);
> +
> +	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
> +
> +	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
> +		rte_config.master_lcore, thread_id, cpuset);
> +
>  	if (rte_eal_dev_init() < 0)
>  		rte_panic("Cannot init pmd devices\n");
>  
> @@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
>  			rte_panic("Cannot create thread\n");
>  	}
>  
> -	eal_thread_init_master(rte_config.master_lcore);
> -
>  	/*
>  	 * Launch a dummy function on all slave lcores, so that master lcore
>  	 * knows they are all ready when this function returns.

I wonder if changing this may have an impact on third-party drivers
that already use a management thread. Before the patch, the init()
function of the external library was called with the default affinity,
and now it's called with the affinity of the master lcore.

I think it should at least be noted in the commit log.

Why are you making this change? (I'm not saying it's a bad change, but
I don't understand why you are doing it here.)


> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index d0c077b..5b16302 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
>  {
>  	int s;
>  	pthread_t thread;
> -
> -/*
> - * According to the section VERSIONS of the CPU_ALLOC man page:
> - *
> - * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
> - * in glibc 2.3.3.
> - *
> - * CPU_COUNT() first appeared in glibc 2.6.
> - *
> - * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
> - * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
> - * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
> - * first appeared in glibc 2.7.
> - */
> -#if defined(CPU_ALLOC)
> -	size_t size;
> -	cpu_set_t *cpusetp;
> -
> -	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
> -	if (cpusetp == NULL) {
> -		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
> -		return -1;
> -	}
> -
> -	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
> -	CPU_ZERO_S(size, cpusetp);
> -	CPU_SET_S(rte_lcore_id(), size, cpusetp);
> +	unsigned lcore_id = rte_lcore_id();
>  
>  	thread = pthread_self();
> -	s = pthread_setaffinity_np(thread, size, cpusetp);
> +	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
> +				   &lcore_config[lcore_id].cpuset);
>  	if (s != 0) {
>  		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> -		CPU_FREE(cpusetp);
>  		return -1;
>  	}
>  
> -	CPU_FREE(cpusetp);
> -#else /* CPU_ALLOC */
> -	cpuset_t cpuset;
> -	CPU_ZERO( &cpuset );
> -	CPU_SET( rte_lcore_id(), &cpuset );
> +	/* acquire system unique id  */
> +	rte_gettid();

As suggested in the previous patch, I think having rte_init_tid() would
be clearer here.

> +
> +	/* store socket_id in TLS for quick access */
> +	RTE_PER_LCORE(_socket_id) =
> +		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
> +
> +	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
> +
> +	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
>  
> -	thread = pthread_self();
> -	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
> -	if (s != 0) {
> -		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> -		return -1;
> -	}
> -#endif

You are removing a lot of code that was using CPU_ALLOC().
Are we sure that the cpuset_t type is large enough to store all the
CPUs?

It looks like the current value of CPU_SETSIZE is 1024, but I wonder
if this code was written when this value was lower. Could you check if
that can still happen today (maybe with an old libc)? A problem can occur if
cpuset_t holds fewer CPUs than RTE_MAX_LCORE.


Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-09 13:50             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> Some macro already been defined by freebsd 'sys/param.h'.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_pmd_enic/enic.h        | 1 +
>  lib/librte_pmd_enic/enic_compat.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
> index c43417c..189c3b9 100644
> --- a/lib/librte_pmd_enic/enic.h
> +++ b/lib/librte_pmd_enic/enic.h
> @@ -66,6 +66,7 @@
>  #define ENIC_CALC_IP_CKSUM      1
>  #define ENIC_CALC_TCP_UDP_CKSUM 2
>  #define ENIC_MAX_MTU            9000
> +#undef PAGE_SIZE
>  #define PAGE_SIZE               4096
>  #define PAGE_ROUND_UP(x) \
>  	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
> diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
> index b1af838..b84c766 100644
> --- a/lib/librte_pmd_enic/enic_compat.h
> +++ b/lib/librte_pmd_enic/enic_compat.h
> @@ -67,6 +67,7 @@
>  #define pr_warn(y, args...) dev_warning(0, y, ##args)
>  #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
>  
> +#undef ALIGN
>  #define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
>  #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
>  #define udelay usleep
> 

Is the issue caused by a change you've made previously in the patch
series?

Wouldn't it be better to rename the macros in enic instead of doing
#undef?

Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-02-08 20:00           ` Olivier MATZ
  2015-02-09 14:08             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:00 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> Add check for rte_socket_id(), avoid get unexpected return like (-1).
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_malloc/malloc_heap.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
> index b4aec45..a47136d 100644
> --- a/lib/librte_malloc/malloc_heap.h
> +++ b/lib/librte_malloc/malloc_heap.h
> @@ -44,7 +44,12 @@ extern "C" {
>  static inline unsigned
>  malloc_get_numa_socket(void)
>  {
> -	return rte_socket_id();
> +	unsigned socket_id = rte_socket_id();
> +
> +	if (socket_id == (unsigned)SOCKET_ID_ANY)
> +		return 0;
> +
> +	return socket_id;
>  }
>  
>  void *
> 

The documentation of rte_malloc_socket() says:

@param socket
  NUMA socket to allocate memory on. If SOCKET_ID_ANY is used, this
  function will behave the same as rte_malloc().

void *
rte_malloc_socket(const char *type, size_t size, unsigned align, int socket);


Your patch changes the behavior of rte_malloc() without explaining
why, and the documentation becomes wrong.

Can you explain why you need this change?

Regards,
Olivier

* Re: [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread Cunming Liang
@ 2015-02-08 20:01           ` Olivier MATZ
  2015-02-09 14:19             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:01 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> For those non-EAL thread, *_lcore_id* is invalid and probably larger than RTE_MAX_LCORE.
> The patch adds the check and allows only EAL thread using EAL per thread log level and log type.
> Others shares the global log level.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
>  lib/librte_eal/common/include/rte_log.h |  5 +++++
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
> index cf57619..e8dc94a 100644
> --- a/lib/librte_eal/common/eal_common_log.c
> +++ b/lib/librte_eal/common/eal_common_log.c
> @@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
>  		rte_logs.type &= (~type);
>  }
>  
> +/* Get global log type */
> +uint32_t
> +rte_get_log_type(void)
> +{
> +	return rte_logs.type;
> +}
> +
>  /* get the current loglevel for the message beeing processed */
>  int rte_log_cur_msg_loglevel(void)
>  {
>  	unsigned lcore_id;
>  	lcore_id = rte_lcore_id();
> +	if (lcore_id >= RTE_MAX_LCORE)
> +		return rte_get_log_level();
>  	return log_cur_msg[lcore_id].loglevel;
>  }
>  
> @@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
>  {
>  	unsigned lcore_id;
>  	lcore_id = rte_lcore_id();
> +	if (lcore_id >= RTE_MAX_LCORE)
> +		return rte_get_log_type();
>  	return log_cur_msg[lcore_id].logtype;
>  }
>  
> @@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
>  
>  	/* save loglevel and logtype in a global per-lcore variable */
>  	lcore_id = rte_lcore_id();
> -	log_cur_msg[lcore_id].loglevel = level;
> -	log_cur_msg[lcore_id].logtype = logtype;
> +	if (lcore_id < RTE_MAX_LCORE) {
> +		log_cur_msg[lcore_id].loglevel = level;
> +		log_cur_msg[lcore_id].logtype = logtype;
> +	}
>  
>  	ret = vfprintf(f, format, ap);
>  	fflush(f);
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index db1ea08..f83a0d9 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
>  void rte_set_log_type(uint32_t type, int enable);
>  
>  /**
> + * Get the global log type.
> + */
> +uint32_t rte_get_log_type(void);
> +
> +/**
>   * Get the current loglevel for the message being processed.
>   *
>   * Before calling the user-defined stream for logging, the log
> 

Wouldn't it be better to change the variable:
static struct log_cur_msg log_cur_msg[RTE_MAX_LCORE];
into a pthread (tls) variable?

With your patch, the log level and log type are not saved for
non-EAL threads. If TLS were used, I think it would work in any case.

Regards,
Olivier
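
A rough sketch of the TLS alternative, assuming a compiler that supports the
__thread storage class (GCC, Clang, icc). The struct layout and the sketch_*
names are illustrative only and are not taken from eal_common_log.c.

```c
#include <stdint.h>

/* Assumed shape of the per-message state; illustrative only. */
struct log_cur_msg {
	uint32_t loglevel;
	uint32_t logtype;
};

/* One instance per thread, whether it is an EAL thread or not. */
static __thread struct log_cur_msg cur_msg;

/* Would replace the log_cur_msg[lcore_id] stores in rte_vlog(). */
static void
sketch_save_cur_msg(uint32_t level, uint32_t logtype)
{
	cur_msg.loglevel = level;
	cur_msg.logtype = logtype;
}

/* No lcore_id lookup and no RTE_MAX_LCORE bound check needed. */
static int
sketch_cur_msg_loglevel(void)
{
	return (int)cur_msg.loglevel;
}

static int
sketch_cur_msg_logtype(void)
{
	return (int)cur_msg.logtype;
}
```

Since the variable is no longer indexed by lcore_id, the same code path would
serve EAL and non-EAL threads alike.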

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
@ 2015-02-08 20:01           ` Olivier MATZ
  2015-02-09 14:24             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:01 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> For those none EAL thread, *_lcore_id* shall always be LCORE_ID_ANY.
> The libraries using *_lcore_id* as index need to take care.
> *_socket_id* always be SOCKET_ID_ANY unitl the thread changes the affinity

unitl -> until

> by rte_thread_set_affinity()
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
>  lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
> index 5b16302..2b3c9a8 100644
> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> @@ -56,8 +56,8 @@
>  #include "eal_private.h"
>  #include "eal_thread.h"
>  
> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
>  
>  /*
> diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
> index 6eb1525..ab94e20 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
> @@ -57,8 +57,8 @@
>  #include "eal_private.h"
>  #include "eal_thread.h"
>  
> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

As far as I understand, rte_lcore_id() can now return LCORE_ID_ANY.
This should be reflected in the rte_lcore_id() API comments.

Same for rte_socket_id().

I also wonder if the API of these functions should be modified to
return an int instead of an unsigned as LCORE_ID_ANY is -1.

Regards,
Olivier
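
A small sketch of what the return type means for callers. RTE_MAX_LCORE is
given an assumed value of 128 and the sketch_* helpers are hypothetical; the
point is only how the bound check behaves for each type.

```c
#define RTE_MAX_LCORE 128	/* assumed build-time value */
#define LCORE_ID_ANY -1		/* same value as in rte_lcore.h */

/* Check that works with the current unsigned return type. */
static int
sketch_is_eal_lcore(unsigned lcore_id)
{
	/* (unsigned)LCORE_ID_ANY is the largest unsigned value, so it fails. */
	return lcore_id < RTE_MAX_LCORE;
}

/* Check that would be needed if the API returned int instead. */
static int
sketch_is_eal_lcore_int(int lcore_id)
{
	/* -1 < RTE_MAX_LCORE holds for int, so the sign must be tested too. */
	return lcore_id >= 0 && lcore_id < RTE_MAX_LCORE;
}
```

With unsigned, the "lcore_id < RTE_MAX_LCORE" guards used across the series
reject LCORE_ID_ANY on their own; with int, each guard would also need a sign
test.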

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread Cunming Liang
@ 2015-02-08 20:01           ` Olivier MATZ
  2015-02-09 14:41             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-08 20:01 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> For non-EAL thread, bypass per lcore cache, directly use ring pool.
> It allows using rte_mempool in either EAL thread or any user pthread.
> As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> It doesn't suggest to run multi-pthread/cpu which compete the rte_mempool.
> It will get bad performance and has critical risk if scheduling policy is RT.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index 3314651..4845f27 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -198,10 +198,12 @@ struct rte_mempool {
>   *   Number to add to the object-oriented statistics.
>   */
>  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> -		unsigned __lcore_id = rte_lcore_id();		\
> -		mp->stats[__lcore_id].name##_objs += n;		\
> -		mp->stats[__lcore_id].name##_bulk += 1;		\
> +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> +		unsigned __lcore_id = rte_lcore_id();           \
> +		if (__lcore_id < RTE_MAX_LCORE) {               \
> +			mp->stats[__lcore_id].name##_objs += n;	\
> +			mp->stats[__lcore_id].name##_bulk += 1;	\
> +		}                                               \

Does it mean that we have no statistics for non-EAL threads?
(same question for rings and timers in the next patches)


>  	} while(0)
>  #else
>  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>  	__MEMPOOL_STAT_ADD(mp, put, n);
>  
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> -	/* cache is not enabled or single producer */
> -	if (unlikely(cache_size == 0 || is_mp == 0))
> +	/* cache is not enabled or single producer or none EAL thread */
> +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> +		     lcore_id >= RTE_MAX_LCORE))
>  		goto ring_enqueue;
>  
>  	/* Go straight to ring if put would overflow mem allocated for cache */
> @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
>  	uint32_t cache_size = mp->cache_size;
>  
>  	/* cache is not enabled or single consumer */
> -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
>  		goto ring_dequeue;
>  
>  	cache = &mp->local_cache[lcore_id];
> 

What is the performance impact of adding this test?


Regards,
Olivier
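
To illustrate the statistics question, here is a reduced model of the guarded
macro from the quoted diff. The struct, the array and the sketch_* names are
hypothetical, and RTE_MAX_LCORE is given an assumed value.

```c
#define RTE_MAX_LCORE 128	/* assumed build-time value */

/* Reduced stand-in for the per-lcore debug statistics. */
struct sketch_stats {
	unsigned long put_objs;
	unsigned long put_bulk;
};

static struct sketch_stats sketch_stats[RTE_MAX_LCORE];

/* Same guard as the patched __MEMPOOL_STAT_ADD(), as a function. */
static void
sketch_stat_add_put(unsigned lcore_id, unsigned n)
{
	if (lcore_id < RTE_MAX_LCORE) {
		sketch_stats[lcore_id].put_objs += n;
		sketch_stats[lcore_id].put_bulk += 1;
	}
	/* else: non-EAL thread, the update is dropped */
}

/* Sum over all lcores, as a stats dump would do. */
static unsigned long
sketch_total_put_objs(void)
{
	unsigned long total = 0;
	unsigned i;

	for (i = 0; i < RTE_MAX_LCORE; i++)
		total += sketch_stats[i].put_objs;

	return total;
}
```

In this model, objects put by a thread whose lcore id is (unsigned)-1 never
show up in the total.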

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
  2015-02-08 19:59           ` Olivier MATZ
@ 2015-02-09 11:33             ` Liang, Cunming
  2015-02-09 17:06               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 11:33 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:00 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread
> lcore_config
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
> > as the lcore no longer always 1:1 pinning with physical cpu.
> > The lcore now stands for a EAL thread rather than a logical cpu.
> >
> > It doesn't change the default behavior of 1:1 mapping, but allows to
> > affinity the EAL thread to multiple cpus.
> >
> > [...]
> > diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c
> b/lib/librte_eal/bsdapp/eal/eal_memory.c
> > index 65ee87d..a34d500 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal_memory.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
> > @@ -45,6 +45,8 @@
> >  #include "eal_internal_cfg.h"
> >  #include "eal_filesystem.h"
> >
> > +/* avoid re-defined against with freebsd header */
> > +#undef PAGE_SIZE
> >  #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
> 
> I don't see the link with the patch. Should this go somewhere else?
> 
> 
> >
> >  /*
> > diff --git a/lib/librte_eal/common/include/rte_lcore.h
> b/lib/librte_eal/common/include/rte_lcore.h
> > index 49b2c03..4c7d6bb 100644
> > --- a/lib/librte_eal/common/include/rte_lcore.h
> > +++ b/lib/librte_eal/common/include/rte_lcore.h
> > @@ -50,6 +50,13 @@ extern "C" {
> >
> >  #define LCORE_ID_ANY -1    /**< Any lcore. */
> >
> > +#if defined(__linux__)
> > +	typedef	cpu_set_t rte_cpuset_t;
> > +#elif defined(__FreeBSD__)
> > +#include <pthread_np.h>
> > +	typedef cpuset_t rte_cpuset_t;
> > +#endif
> > +
> 
> Should we also define RTE_CPU_SETSIZE?
> For linux, should <sched.h> be included?
[LCM] It uses a fixed-size cpuset and won't use CPU_ALLOC() to obtain a cpuset pointer.
So RTE_CPU_SETSIZE would always be equal to sizeof(rte_cpuset_t).
> 
> If I understand well, after the patch series, the user of
> rte_thread_set_affinity() and rte_thread_get_affinity() are
> supposed to use the macros from sched.h to access to this
> cpuset parameter. So I'm wondering if it's not better to
> use cpu_set_t from libc instead of redefining rte_cpuset_t.
> 
> To reword my question: what is the purpose of redefining
> cpu_set_t in rte_cpuset_t if we still need to use all the
> libc API to access to it?
[LCM] In linux the type is *cpu_set_t*, but in freebsd it's *cpuset_t*.
The purpose of *rte_cpuset_t* is to have a consistent type definition in EAL and to avoid lots of #ifdef for this difference.
On either linux or freebsd, the libc macros can still be used to set the rte_cpuset_t.
> 
> 
> Regards,
> Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu assignment
  2015-02-08 19:59           ` Olivier MATZ
@ 2015-02-09 11:45             ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 11:45 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:00 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 02/17] eal: new eal option '--lcores' for cpu
> assignment
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > It supports one new eal long option '--lcores' for EAL thread cpuset assignment.
> >
> > The format pattern:
> > 	--lcores='lcores[@cpus]<,lcores[@cpus]>'
> > lcores, cpus could be a single digit/range or a group.
> > '(' and ')' are necessary if it's a group.
> > If not supply '@cpus', the value of cpus uses the same as lcores.
> >
> > e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL thread as below
> >   lcore 0 runs on cpuset 0x41 (cpu 0,6)
> >   lcore 1 runs on cpuset 0x2 (cpu 1)
> >   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
> >   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
> >   lcore 6 runs on cpuset 0x41 (cpu 0,6)
> >   lcore 7 runs on cpuset 0x80 (cpu 7)
> >   lcore 8 runs on cpuset 0x100 (cpu 8)
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/common/eal_common_launch.c  |   1 -
> >  lib/librte_eal/common/eal_common_options.c | 300
> ++++++++++++++++++++++++++++-
> >  lib/librte_eal/common/eal_options.h        |   2 +
> >  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
> >  4 files changed, 299 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_launch.c
> b/lib/librte_eal/common/eal_common_launch.c
> > index 599f83b..2d732b1 100644
> > --- a/lib/librte_eal/common/eal_common_launch.c
> > +++ b/lib/librte_eal/common/eal_common_launch.c
> > @@ -117,4 +117,3 @@ rte_eal_mp_wait_lcore(void)
> >  		rte_eal_wait_lcore(lcore_id);
> >  	}
> >  }
> > -
> 
> 
> This line should be removed from the patch.
[LCM] Accept.
> 
> 
> > diff --git a/lib/librte_eal/common/eal_common_options.c
> b/lib/librte_eal/common/eal_common_options.c
> > index 67e02dc..29ebb6f 100644
> > --- a/lib/librte_eal/common/eal_common_options.c
> > +++ b/lib/librte_eal/common/eal_common_options.c
> > @@ -45,6 +45,7 @@
> >  #include <rte_lcore.h>
> >  #include <rte_version.h>
> >  #include <rte_devargs.h>
> > +#include <rte_memcpy.h>
> >
> >  #include "eal_internal_cfg.h"
> >  #include "eal_options.h"
> > @@ -85,6 +86,7 @@ eal_long_options[] = {
> >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > +	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
> >  	{0, 0, 0, 0}
> >  };
> >
> > @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
> >  			if (min == RTE_MAX_LCORE)
> >  				min = idx;
> >  			for (idx = min; idx <= max; idx++) {
> > -				cfg->lcore_role[idx] = ROLE_RTE;
> > -				lcore_config[idx].core_index = count;
> > -				count++;
> > +				if (cfg->lcore_role[idx] != ROLE_RTE) {
> > +					cfg->lcore_role[idx] = ROLE_RTE;
> > +					lcore_config[idx].core_index = count;
> > +					count++;
> > +				}
> >  			}
> >  			min = RTE_MAX_LCORE;
> >  		} else
> > @@ -292,6 +296,279 @@ eal_parse_master_lcore(const char *arg)
> >  	return 0;
> >  }
> >
> > +/*
> > + * Parse elem, the elem could be single number/range or '(' ')' group
> > + * Within group elem, '-' used for a range seperator;
> > + *                    ',' used for a single number.
> > + */
> > +static int
> > +eal_parse_set(const char *input, uint16_t set[], unsigned num)
> 
> It's not very clear what elem is. Maybe it could be a bit reworded.
> What about naming the function "eal_parse_cpuset()" instead?
[LCM] It parses not only the cpuset but also the lcore set, so 'eal_parse_cpuset' would not be accurate.
The set/elem here stands for a single number (e.g. 1), a number range (e.g. 4-6) or a group (e.g. (3,4-8,9) ).
I'll reword the comment to make it easier to understand. Thanks.
> 
> 
> > +{
> > +	unsigned idx;
> > +	const char *str = input;
> > +	char *end = NULL;
> > +	unsigned min, max;
> > +
> > +	memset(set, 0, num * sizeof(uint16_t));
> > +
> > +	while (isblank(*str))
> > +		str++;
> > +
> > +	/* only digit or left bracket is qulify for start point */
> > +	if ((!isdigit(*str) && *str != '(') || *str == '\0')
> > +		return -1;
> > +
> > +	/* process single number or single range of number */
> > +	if (*str != '(') {
> > +		errno = 0;
> > +		idx = strtoul(str, &end, 10);
> > +		if (errno || end == NULL || idx >= num)
> > +			return -1;
> > +		else {
> > +			while (isblank(*end))
> > +				end++;
> > +
> > +			min = idx;
> > +			max = idx;
> > +			if (*end == '-') {
> > +				/* proccess single <number>-<number> */
> > +				end++;
> > +				while (isblank(*end))
> > +					end++;
> > +				if (!isdigit(*end))
> > +					return -1;
> > +
> > +				errno = 0;
> > +				idx = strtoul(end, &end, 10);
> > +				if (errno || end == NULL || idx >= num)
> > +					return -1;
> > +				max = idx;
> > +				while (isblank(*end))
> > +					end++;
> > +				if (*end != ',' && *end != '\0')
> > +					return -1;
> > +			}
> > +
> > +			if (*end != ',' && *end != '\0' &&
> > +			    *end != '@')
> > +				return -1;
> > +
> > +			for (idx = RTE_MIN(min, max);
> > +			     idx <= RTE_MAX(min, max); idx++)
> > +				set[idx] = 1;
> > +
> > +			return end - input;
> > +		}
> > +	}
> > +
> > +	/* process set within bracket */
> > +	str++;
> > +	while (isblank(*str))
> > +		str++;
> > +	if (*str == '\0')
> > +		return -1;
> > +
> > +	min = RTE_MAX_LCORE;
> > +	do {
> > +
> > +		/* go ahead to the first digit */
> > +		while (isblank(*str))
> > +			str++;
> > +		if (!isdigit(*str))
> > +			return -1;
> > +
> > +		/* get the digit value */
> > +		errno = 0;
> > +		idx = strtoul(str, &end, 10);
> > +		if (errno || end == NULL || idx >= num)
> > +			return -1;
> > +
> > +		/* go ahead to separator '-',',' and ')' */
> > +		while (isblank(*end))
> > +			end++;
> > +		if (*end == '-') {
> > +			if (min == RTE_MAX_LCORE)
> > +				min = idx;
> > +			else /* avoid continuous '-' */
> > +				return -1;
> > +		} else if ((*end == ',') || (*end == ')')) {
> > +			max = idx;
> > +			if (min == RTE_MAX_LCORE)
> > +				min = idx;
> > +			for (idx = RTE_MIN(min, max);
> > +			     idx <= RTE_MAX(min, max); idx++)
> > +				set[idx] = 1;
> > +
> > +			min = RTE_MAX_LCORE;
> > +		} else
> > +			return -1;
> > +
> > +		str = end + 1;
> > +	} while (*end != '\0' && *end != ')');
> > +
> > +	return str - input;
> > +}
> 
> In the function above, there are some typos in the comments
>  seperator -> separator
>  qulify -> qualify
>  proccess -> process
[LCM] Sorry for that, will fix it.
> 
> > +
> > +/* convert from set array to cpuset bitmap */
> > +static inline int
> > +convert_to_cpuset(rte_cpuset_t *cpusetp,
> > +	      uint16_t *set, unsigned num)
> 
> I don't think the function should be inlined
[LCM] accept.
> 
> 
> 
> Regards,
> Olivier
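
For readers checking the cpuset values in the commit message, a minimal
sketch of the "set array to bitmap" step. A uint64_t stands in for
rte_cpuset_t, and sketch_set_to_mask() is a hypothetical name; the real
convert_to_cpuset() body is not quoted in this mail.

```c
#include <stdint.h>

/*
 * Fold a flag array (as filled in by a parser like eal_parse_set())
 * into a 64-bit mask. Returns 0 on success, -1 if a flagged index
 * does not fit into the stand-in mask.
 */
static int
sketch_set_to_mask(const uint16_t *set, unsigned num, uint64_t *mask)
{
	unsigned idx;

	*mask = 0;
	for (idx = 0; idx < num; idx++) {
		if (!set[idx])
			continue;
		if (idx >= 64)
			return -1;
		*mask |= UINT64_C(1) << idx;
	}

	return 0;
}
```

Flagging cpus 0 and 6 gives 0x41, and cpus 5 to 7 give 0xe0, matching the
values listed for lcore 0 and lcore 2 in the commit message.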

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc
  2015-02-08 19:59           ` Olivier MATZ
@ 2015-02-09 11:57             ` Liang, Cunming
  2015-02-09 17:13               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 11:57 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:00 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in
> 32bit icc
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > The problem is that strnlen() here may return invalid value with 32bit icc.
> > (actually it returns its second parameter, e.g. sysconf(_SC_ARG_MAX)).
> > It starts to manifest when max_len parameter is > 2M and using icc -m32 -O2
> (or above).
> >
> > Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/common/eal_common_options.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_options.c
> b/lib/librte_eal/common/eal_common_options.c
> > index 29ebb6f..22d5d37 100644
> > --- a/lib/librte_eal/common/eal_common_options.c
> > +++ b/lib/librte_eal/common/eal_common_options.c
> > @@ -227,7 +227,7 @@ eal_parse_corelist(const char *corelist)
> >  	/* Remove all blank characters ahead and after */
> >  	while (isblank(*corelist))
> >  		corelist++;
> > -	i = strnlen(corelist, sysconf(_SC_ARG_MAX));
> > +	i = strnlen(corelist, PATH_MAX);
> >  	while ((i > 0) && isblank(corelist[i - 1]))
> >  		i--;
> >
> > @@ -469,7 +469,7 @@ eal_parse_lcores(const char *lcores)
> >  	/* Remove all blank characters ahead and after */
> >  	while (isblank(*lcores))
> >  		lcores++;
> > -	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> > +	i = strnlen(lcores, PATH_MAX);
> >  	while ((i > 0) && isblank(lcores[i - 1]))
> >  		i--;
> >
> >
> 
> I think PATH_MAX is not equivalent to _SC_ARG_MAX.
> 
> But the main question is: why do we need to use strnlen() here instead
> of strlen? We can expect that argv[] pointers are always nul-terminated.
> Replacing them by strlen() would probably also solve the icc issue.
[LCM] You're right, strlen() would also solve the icc issue here, with no risk for argv[].
But following the usual practice, it does no harm to keep using the 'n' variants of these functions in DPDK.
There are two additional reasons to keep strnlen() and PATH_MAX:
1. PATH_MAX is defined as 4096, which is enough for our input. It doesn't matter whether it equals _SC_ARG_MAX or not.
2. strnlen() and PATH_MAX are already used in eal_parse_coremask, so this keeps the style consistent across '-l' and '--lcores'.

> 
> Regards,
> Olivier
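
A short sketch of the strlen() variant suggested above, written as a
standalone helper. sketch_trimmed_len() is a hypothetical name; it only
mirrors the blank-trimming steps visible in the quoted diff and assumes a
NUL-terminated input such as an argv[] entry.

```c
#include <ctype.h>
#include <string.h>

/*
 * Skip leading blanks, then return the length of the remainder with
 * trailing blanks dropped. *start is set to the first non-blank
 * character (or to the terminating NUL if there is none).
 */
static size_t
sketch_trimmed_len(const char *s, const char **start)
{
	size_t i;

	while (isblank((unsigned char)*s))
		s++;

	i = strlen(s);
	while (i > 0 && isblank((unsigned char)s[i - 1]))
		i--;

	*start = s;
	return i;
}
```

No length bound is needed here, so neither sysconf(_SC_ARG_MAX) nor PATH_MAX
appears in this version.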

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-09 12:26             ` Liang, Cunming
  2015-02-09 17:16               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 12:26 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:00 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id
> from cpuset
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > It returns the socket_id if all cpus in the cpuset belongs
> > to the same NUMA node, otherwise it will return SOCKET_ID_ANY.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++
> >  lib/librte_eal/common/eal_thread.h      | 52
> +++++++++++++++++++++++++++++++++
> >  lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 +++++
> >  3 files changed, 66 insertions(+)
> >
> > diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c
> b/lib/librte_eal/bsdapp/eal/eal_lcore.c
> > index 72f8ac2..162fb4f 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
> > @@ -41,6 +41,7 @@
> >  #include <rte_debug.h>
> >
> >  #include "eal_private.h"
> > +#include "eal_thread.h"
> >
> >  /* No topology information available on FreeBSD including NUMA info */
> >  #define cpu_core_id(X) 0
> > @@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
> >
> >  	return 0;
> >  }
> > +
> > +unsigned
> > +eal_cpu_socket_id(__rte_unused unsigned cpu_id)
> > +{
> > +	return cpu_socket_id(cpu_id);
> > +}
> > diff --git a/lib/librte_eal/common/eal_thread.h
> b/lib/librte_eal/common/eal_thread.h
> > index b53b84d..a25ee86 100644
> > --- a/lib/librte_eal/common/eal_thread.h
> > +++ b/lib/librte_eal/common/eal_thread.h
> > @@ -34,6 +34,10 @@
> >  #ifndef EAL_THREAD_H
> >  #define EAL_THREAD_H
> >
> > +#include <sched.h>
> > +
> > +#include <rte_debug.h>
> > +
> >  /**
> >   * basic loop of thread, called for each thread by eal_init().
> >   *
> > @@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void
> *arg);
> >   */
> >  void eal_thread_init_master(unsigned lcore_id);
> >
> > +/**
> > + * Get the NUMA socket id from cpu id.
> > + * This function is private to EAL.
> > + *
> > + * @param cpu_id
> > + *   The logical process id.
> > + * @return
> > + *   socket_id or SOCKET_ID_ANY
> > + */
> > +unsigned eal_cpu_socket_id(unsigned cpu_id);
> 
> Wouldn't it be better to rename the existing function cpu_socket_id()
> in eal_cpu_socket_id() and export it in eal_thread.h?
> 
> In case of bsd where cpu_socket_id() is implemented using a #define,
> a new function should be created returning 0.
[LCM] In eal_lcore.c, cpu_socket_id() and cpu_core_id() are defined as static and only used in rte_eal_cpu_init().
I suppose the purpose of the original design was to keep the sysfs parsing visible only within that file.
Whether we remove the 'static' qualifier of cpu_core_id() or add a new wrapper eal_cpu_socket_id(), the result is a new extern EAL API.
So I prefer not to change the visibility of the original static function, and to add a wrapper as the extern interface instead.
> 
> 
> > +
> > +/**
> > + * Get the NUMA socket id from cpuset.
> > + * This function is private to EAL.
> > + *
> > + * @param cpusetp
> > + *   The point to a valid cpu set.
> > + * @return
> > + *   socket_id or SOCKET_ID_ANY
> > + */
> > +static inline int
> > +eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
> > +{
> > +	unsigned cpu = 0;
> > +	int socket_id = SOCKET_ID_ANY;
> > +	int sid;
> > +
> > +	if (cpusetp == NULL)
> > +		return SOCKET_ID_ANY;
> 
> SOCKET_ID_ANY is not defined, maybe <rte_lcore.h> should be included
> somewhere.
[LCM] Agreed, eal_cpuset_socket_id() can move into eal_common_thread.c,
and rte_memory.h can be included for the SOCKET_ID_ANY definition.
> 
> > +
> > +	do {
> > +		if (!CPU_ISSET(cpu, cpusetp))
> > +			continue;
> > +
> > +		if (socket_id == SOCKET_ID_ANY)
> > +			socket_id = eal_cpu_socket_id(cpu);
> > +
> > +		sid = eal_cpu_socket_id(cpu);
> > +		if (socket_id != sid) {
> > +			socket_id = SOCKET_ID_ANY;
> > +			break;
> > +		}
> > +
> > +	} while (++cpu < RTE_MAX_LCORE);
> > +
> > +	return socket_id;
> > +}
> 
> 
> I don't think this function should be inlined.
> 
> As this function is not used, it could be interesting for reviewers
> to understand when
[LCM] It's used in eal_thread_set_affinity() of eal_thread.c.
> 
> > +
> >  #endif /* EAL_THREAD_H */
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c
> b/lib/librte_eal/linuxapp/eal/eal_lcore.c
> > index 29615f8..922af6d 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
> > @@ -45,6 +45,7 @@
> >
> >  #include "eal_private.h"
> >  #include "eal_filesystem.h"
> > +#include "eal_thread.h"
> >
> >  #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
> >  #define CORE_ID_FILE "topology/core_id"
> > @@ -197,3 +198,9 @@ rte_eal_cpu_init(void)
> >
> >  	return 0;
> >  }
> > +
> > +unsigned
> > +eal_cpu_socket_id(unsigned cpu_id)
> > +{
> > +	return cpu_socket_id(cpu_id);
> > +}
> >
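
The rule in the commit message ("same NUMA node, otherwise SOCKET_ID_ANY")
can be sketched without any EAL dependency. The cpu_flags[] and socket_of[]
arrays below are hypothetical stand-ins for CPU_ISSET() and
eal_cpu_socket_id(); SOCKET_ID_ANY is redefined locally with the same value
as in rte_memory.h.

```c
#define SOCKET_ID_ANY -1

/*
 * cpu_flags[i] is non-zero when cpu i belongs to the set;
 * socket_of[i] is the NUMA node of cpu i.
 * Returns the common socket id, or SOCKET_ID_ANY if the set is empty
 * or spans more than one node.
 */
static int
sketch_cpuset_socket_id(const unsigned char *cpu_flags,
			const int *socket_of, unsigned num)
{
	int socket_id = SOCKET_ID_ANY;
	unsigned cpu;

	for (cpu = 0; cpu < num; cpu++) {
		if (!cpu_flags[cpu])
			continue;

		if (socket_id == SOCKET_ID_ANY)
			socket_id = socket_of[cpu];
		else if (socket_id != socket_of[cpu])
			return SOCKET_ID_ANY;
	}

	return socket_id;
}
```

This version looks up the socket of each cpu once per iteration; the result
should be the same as the do/while loop in the patch for the cases tried
here.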

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-09 12:45             ` Liang, Cunming
  2015-02-09 17:26               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 12:45 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:00 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API
> declaration
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > 1. add two TLS *_socket_id* and *_cpuset*
> > 2. add two external API rte_thread_set/get_affinity
> > 3. add one internal API eal_thread_dump_affinity
> 
> To me, it's a bit strange to add an API without the associated code.
> Maybe you have a good reason to do that, but I think in this case it
> should be explained in the commit log.
[LCM] Accept.
> 
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
> >  lib/librte_eal/common/eal_thread.h        | 14 ++++++++++++++
> >  lib/librte_eal/common/include/rte_lcore.h | 29
> +++++++++++++++++++++++++++--
> >  lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
> >  4 files changed, 45 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c
> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> > index ab05368..10220c7 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> > @@ -56,6 +56,8 @@
> >  #include "eal_thread.h"
> >
> >  RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> > +RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> > +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
> >
> >  /*
> >   * Send a message to a slave lcore identified by slave_id to call a
> > diff --git a/lib/librte_eal/common/eal_thread.h
> b/lib/librte_eal/common/eal_thread.h
> > index a25ee86..28edf51 100644
> > --- a/lib/librte_eal/common/eal_thread.h
> > +++ b/lib/librte_eal/common/eal_thread.h
> > @@ -102,4 +102,18 @@ eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
> >  	return socket_id;
> >  }
> >
> > +/**
> > + * Dump the current pthread cpuset.
> > + * This function is private to EAL.
> > + *
> > + * @param str
> > + *   The string buffer the cpuset will dump to.
> > + * @param size
> > + *   The string buffer size.
> > + */
> > +#define CPU_STR_LEN            256
> > +void
> > +eal_thread_dump_affinity(char str[], unsigned size);
> 
> Although it's equivalent for function arguments, I think "char *str" is
> usually preferred over "char str[]". See for instance in snprintf() or
> fgets().
[LCM] Accept.
> 
> What is the purpose of CPU_STR_LEN?
[LCM] It is a default size, a quick reference for defining the str[] buffer passed to dump_affinity().
> 
> What occurs if the size of the dump is greater than the size of the
> given buffer? Is the string truncated? Is there a \0 at the end?
[LCM] Yes, there is always a '\0' at the end.
> This should be described in the API comments.
[LCM] Accept.
> Maybe adding a return
> value could help the user to determine if the string was truncated.
[LCM] Good idea, so the user can continue to print '...' for the truncated part.
> 
> > +
> > +
> >  #endif /* EAL_THREAD_H */
> > diff --git a/lib/librte_eal/common/include/rte_lcore.h
> b/lib/librte_eal/common/include/rte_lcore.h
> > index 4c7d6bb..facdbdc 100644
> > --- a/lib/librte_eal/common/include/rte_lcore.h
> > +++ b/lib/librte_eal/common/include/rte_lcore.h
> > @@ -43,6 +43,7 @@
> >  #include <rte_per_lcore.h>
> >  #include <rte_eal.h>
> >  #include <rte_launch.h>
> > +#include <rte_memory.h>
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -80,7 +81,9 @@ struct lcore_config {
> >   */
> >  extern struct lcore_config lcore_config[RTE_MAX_LCORE];
> >
> > -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
> > +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id".
> */
> > +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id".
> */
> > +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset".
> */
> >
> >  /**
> >   * Return the ID of the execution unit we are running on.
> > @@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
> >  static inline unsigned
> >  rte_socket_id(void)
> >  {
> > -	return lcore_config[rte_lcore_id()].socket_id;
> > +	return RTE_PER_LCORE(_socket_id);
> >  }
> 
> I don't see where the _socket_id variable is assigned. I think there
> is probably an issue with the splitting of the patches.
[LCM] The value is initialized to SOCKET_ID_ANY at RTE_DEFINE_PER_LCORE().
It is then updated in eal_thread_set_affinity() for EAL threads and in rte_thread_set_affinity() for non-EAL threads.
> 
> Regards,
> Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-09 13:12             ` Liang, Cunming
  2015-02-09 17:30               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 13:12 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:00 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for
> common thread API
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > The API works for both EAL thread and none EAL thread.
> > When calling rte_thread_set_affinity, the *_socket_id* and
> > *_cpuset* of calling thread will be updated if the thread
> > successful set the cpu affinity.
> >
> > [...]
> > +int
> > +rte_thread_set_affinity(rte_cpuset_t *cpusetp)
> > +{
> > +	int s;
> > +	unsigned lcore_id;
> > +	pthread_t tid;
> > +
> > +	if (!cpusetp)
> > +		return -1;
> 
> Is it really needed to test that cpusetp is not NULL?
[LCM] Accepted. We can drop the check and rely on pthread_setaffinity_np() to return failure.
> 
> > +
> > +	lcore_id = rte_lcore_id();
> > +	if (lcore_id != (unsigned)LCORE_ID_ANY) {
> 
> This is strange to see something that cannot happen:
> lcore_id == LCORE_ID_ANY is only possible after your patch is 12/17
> is added. Maybe it can be reordered to avoid this inconsistency?
[LCM] You're right, the order is unusual here.
The point is to have everything ready before switching the default value to -1,
and to keep the whole function implementation in one patch.
Until the default changes, that branch simply has no effect, and it adds no risk.
> 
> > +		/* EAL thread */
> > +		tid = lcore_config[lcore_id].thread_id;
> > +
> > +		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
> > +		if (s != 0) {
> > +			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> > +			return -1;
> > +		}
> > +
> > +		/* store socket_id in TLS for quick access */
> > +		RTE_PER_LCORE(_socket_id) =
> > +			eal_cpuset_socket_id(cpusetp);
> > +
> > +		/* store cpuset in TLS for quick access */
> > +		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
> > +			   sizeof(rte_cpuset_t));
> > +
> > +		/* update lcore_config */
> > +		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
> > +		rte_memcpy(&lcore_config[lcore_id].cpuset, cpusetp,
> > +			   sizeof(rte_cpuset_t));
> > +	} else {
> > +		/* none EAL thread */
> > +		tid = pthread_self();
> > +
> > +		s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
> > +		if (s != 0) {
> > +			RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> > +			return -1;
> > +		}
> > +
> > +		/* store cpuset in TLS for quick access */
> > +		rte_memcpy(&RTE_PER_LCORE(_cpuset), cpusetp,
> > +			   sizeof(rte_cpuset_t));
> > +
> > +		/* store socket_id in TLS for quick access */
> > +		RTE_PER_LCORE(_socket_id) =
> > +			eal_cpuset_socket_id(cpusetp);
> > +	}
> 
> Why not always using pthread_self() to get the tid?
[LCM] Good point, I hadn't noticed it.
> 
> I think most of the code could be factorized here. The only difference
> (which is hard to see as is as code is not exactly ordered in the same
> manner) is that the config is updated in case it's an EAL thread.
[LCM] Accepted.
> 
> 
> 
> > +
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_thread_get_affinity(rte_cpuset_t *cpusetp)
> > +{
> > +	if (!cpusetp)
> > +		return -1;
> 
> Same here. This is the only reason why rte_thread_get_affinity() could
> fail. Removing this test would allow to change the API to return void
> instead. It will avoid a useless test below in
> eal_thread_dump_affinity().
[LCM] cpusetp is the destination of the memcpy, and the function is meant to be an EAL API.
I don't think it's a good idea to remove the check, do you?
> 
> > +
> > +	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
> > +		   sizeof(rte_cpuset_t));
> > +
> > +	return 0;
> > +}
> > +
> > +void
> > +eal_thread_dump_affinity(char str[], unsigned size)
> > +{
> > +	rte_cpuset_t cpuset;
> > +	unsigned cpu;
> > +	int ret;
> > +	unsigned int out = 0;
> > +
> > +	if (rte_thread_get_affinity(&cpuset) < 0) {
> > +		str[0] = '\0';
> > +		return;
> > +	}
> 
> This one could be removed it the (== NULL) test is removed.
> 
> > +
> > +	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
> > +		if (!CPU_ISSET(cpu, &cpuset))
> > +			continue;
> > +
> > +		ret = snprintf(str + out,
> > +			       size - out, "%u,", cpu);
> > +		if (ret < 0 || (unsigned)ret >= size - out)
> > +			break;
> 
> On the contrary, I think here returning an error to the user
> would be useful so he can knows that the dump is not complete.
[LCM] Accepted.
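A minimal sketch of a dump helper that reports truncation to the caller, assuming a 64-bit mask as a stand-in for rte_cpuset_t. dump_cpu_list() is a hypothetical name, and the final signature of eal_thread_dump_affinity() may well differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write the cpus set in 'mask' (bit n = cpu n) as "a,b,c".
 * Returns 0 if everything fit, -1 if the output was truncated.
 * On truncation the buffer holds the entries that fit completely.
 */
static int dump_cpu_list(uint64_t mask, char *str, size_t size)
{
	size_t out = 0;
	unsigned cpu;

	assert(str != NULL && size > 0);
	str[0] = '\0';

	for (cpu = 0; cpu < 64; cpu++) {
		int ret;

		if (!(mask & (UINT64_C(1) << cpu)))
			continue;

		ret = snprintf(str + out, size - out,
			       out ? ",%u" : "%u", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			str[out] = '\0';   /* drop the partial entry */
			return -1;
		}
		out += (size_t)ret;
	}
	return 0;
}
```

On a -1 return the caller still has a valid string and can append '...' itself to mark the dump as incomplete.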
> 
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-09 13:48             ` Liang, Cunming
  2015-02-09 17:36               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 13:48 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by
> assigned cpuset
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > EAL threads use assigned cpuset to set core affinity during startup.
> > It keeps 1:1 mapping, if no '--lcores' option is used.
> >
> > [...]
> >
> >  lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
> >  lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
> >  lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
> >  lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
> >  4 files changed, 54 insertions(+), 96 deletions(-)
> >
> > diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> > index 69f3c03..98c5a83 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal.c
> > @@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
> >  	int i, fctret, ret;
> >  	pthread_t thread_id;
> >  	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
> > +	char cpuset[CPU_STR_LEN];
> >
> >  	if (!rte_atomic32_test_and_set(&run_once))
> >  		return -1;
> > @@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
> >  	if (rte_eal_pci_init() < 0)
> >  		rte_panic("Cannot init PCI\n");
> >
> > -	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
> > -		rte_config.master_lcore, thread_id);
> > -
> >  	eal_check_mem_on_local_socket();
> >
> >  	rte_eal_mcfg_complete();
> >
> > +	eal_thread_init_master(rte_config.master_lcore);
> > +
> > +	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
> > +
> > +	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
> > +		rte_config.master_lcore, thread_id, cpuset);
> > +
> >  	if (rte_eal_dev_init() < 0)
> >  		rte_panic("Cannot init pmd devices\n");
> >
> > @@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
> >  			rte_panic("Cannot create thread\n");
> >  	}
> >
> > -	eal_thread_init_master(rte_config.master_lcore);
> > -
> >  	/*
> >  	 * Launch a dummy function on all slave lcores, so that master lcore
> >  	 * knows they are all ready when this function returns.
> 
> I wonder if changing this may have an impact on third-party drivers
> that already use a management thread. Before the patch, the init()
> function of the external library was called with default affinities,
> and now it's called with the affinity from master lcore.
> 
> I think it should at least be noticed in the commit log.
> 
> Why are you doing this change? (I don't say it's a bad change, but
> I don't understand why you are doing it here)
[LCM] To be honest, the main reason is that I couldn't find any reason for linuxapp and freebsdapp to have different init sequences.
I mean that on Linux init_master runs before dev_init(), but on FreeBSD the order is reversed.
And since the default value of the TLS has already changed, if dev_init() runs first and uses those TLS values, it will look as if it were not running in an EAL thread,
although it is actually running in the EAL master thread. So I prefer to make the change and follow the linuxapp sequence.
> 
> 
> > diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c
> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> > index d0c077b..5b16302 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> > @@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
> >  {
> >  	int s;
> >  	pthread_t thread;
> > -
> > -/*
> > - * According to the section VERSIONS of the CPU_ALLOC man page:
> > - *
> > - * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were
> added
> > - * in glibc 2.3.3.
> > - *
> > - * CPU_COUNT() first appeared in glibc 2.6.
> > - *
> > - * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),
> CPU_ALLOC(),
> > - * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),
> CPU_CLR_S(),
> > - * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and
> CPU_EQUAL_S()
> > - * first appeared in glibc 2.7.
> > - */
> > -#if defined(CPU_ALLOC)
> > -	size_t size;
> > -	cpu_set_t *cpusetp;
> > -
> > -	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
> > -	if (cpusetp == NULL) {
> > -		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
> > -		return -1;
> > -	}
> > -
> > -	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
> > -	CPU_ZERO_S(size, cpusetp);
> > -	CPU_SET_S(rte_lcore_id(), size, cpusetp);
> > +	unsigned lcore_id = rte_lcore_id();
> >
> >  	thread = pthread_self();
> > -	s = pthread_setaffinity_np(thread, size, cpusetp);
> > +	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
> > +				   &lcore_config[lcore_id].cpuset);
> >  	if (s != 0) {
> >  		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> > -		CPU_FREE(cpusetp);
> >  		return -1;
> >  	}
> >
> > -	CPU_FREE(cpusetp);
> > -#else /* CPU_ALLOC */
> > -	cpuset_t cpuset;
> > -	CPU_ZERO( &cpuset );
> > -	CPU_SET( rte_lcore_id(), &cpuset );
> > +	/* acquire system unique id  */
> > +	rte_gettid();
> 
> As suggested in the previous patch, I think having rte_init_tid() would
> be clearer here.
[LCM] Sorry, I didn't get your [PATCH v4 07/17] comments, probably a mailbox issue.
Are you suggesting an rte_init_tid(), so that rte_gettid() does not make the syscall on its first call?
What would be the benefit? rte_gettid() looks simpler and more straightforward.
> > +
> > +	/* store socket_id in TLS for quick access */
> > +	RTE_PER_LCORE(_socket_id) =
> > +		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
> > +
> > +	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
> > +
> > +	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
> >
> > -	thread = pthread_self();
> > -	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
> > -	if (s != 0) {
> > -		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> > -		return -1;
> > -	}
> > -#endif
> 
> You are removing a lot of code that was using CPU_ALLOC().
> Are we sure that the cpuset_t type is large enough to store all the
> CPUs?
> 
> It looks the current value of CPU_SETSIZE is 1024 now, but I wonder
> if this code was written when this value was lower. Could you check if
> it can happen today (maybe with an old libc)? A problem can occur if
> the size of cpuset_t is lower that the size of RTE_MAX_LCORE.
[LCM] I found that the macro is actually not just about CPU_ALLOC() support: it separates Linux from FreeBSD.
In freebsdapp, CPU_ALLOC is not defined, so it uses the fixed-width *cpuset_t*.
In linuxapp, CPU_ALLOC is defined, so it uses cpu_set_t* with a dynamic CPU_ALLOC(RTE_MAX_LCORE).
But RTE_MAX_LCORE is actually below 1024 (CPU_SETSIZE, the number of CPUs a cpu_set_t can hold).
With rte_cpuset_t, there is no further reason to use CPU_ALLOC on linuxapp only and to pick a smaller, dynamic width.
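The size concern could also be checked at build time. A minimal sketch, assuming a 1024-bit fixed-width set and a hypothetical limit MODEL_MAX_LCORE; model_cpuset_t and its helpers are illustrative only and are not the libc cpu_set_t macros.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_MAX_LCORE 128     /* assumed build-time lcore limit */
#define MODEL_SETSIZE   1024    /* cpus the fixed-width set can hold */
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)

typedef struct {
	unsigned long bits[MODEL_SETSIZE / BITS_PER_WORD];
} model_cpuset_t;

/* Fail the build, not the run, if the fixed width ever becomes too small. */
static_assert(MODEL_MAX_LCORE <= MODEL_SETSIZE,
	      "fixed-width cpuset cannot hold MODEL_MAX_LCORE cpus");

static void model_cpu_set(unsigned cpu, model_cpuset_t *set)
{
	assert(cpu < MODEL_SETSIZE);
	set->bits[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
}

static int model_cpu_isset(unsigned cpu, const model_cpuset_t *set)
{
	assert(cpu < MODEL_SETSIZE);
	return (int)((set->bits[cpu / BITS_PER_WORD] >>
		      (cpu % BITS_PER_WORD)) & 1UL);
}
```

A check of this kind would catch an old libc with a smaller fixed set at compile time, without needing the CPU_ALLOC() path.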
> 
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile complain
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-09 13:50             ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 13:50 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 09/17] enic: fix re-define freebsd compile
> complain
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > Some macro already been defined by freebsd 'sys/param.h'.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_pmd_enic/enic.h        | 1 +
> >  lib/librte_pmd_enic/enic_compat.h | 1 +
> >  2 files changed, 2 insertions(+)
> >
> > diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
> > index c43417c..189c3b9 100644
> > --- a/lib/librte_pmd_enic/enic.h
> > +++ b/lib/librte_pmd_enic/enic.h
> > @@ -66,6 +66,7 @@
> >  #define ENIC_CALC_IP_CKSUM      1
> >  #define ENIC_CALC_TCP_UDP_CKSUM 2
> >  #define ENIC_MAX_MTU            9000
> > +#undef PAGE_SIZE
> >  #define PAGE_SIZE               4096
> >  #define PAGE_ROUND_UP(x) \
> >  	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
> > diff --git a/lib/librte_pmd_enic/enic_compat.h
> b/lib/librte_pmd_enic/enic_compat.h
> > index b1af838..b84c766 100644
> > --- a/lib/librte_pmd_enic/enic_compat.h
> > +++ b/lib/librte_pmd_enic/enic_compat.h
> > @@ -67,6 +67,7 @@
> >  #define pr_warn(y, args...) dev_warning(0, y, ##args)
> >  #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
> >
> > +#undef ALIGN
> >  #define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
> >  #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
> >  #define udelay usleep
> >
> 
> Is the issue caused by a change you've made previously in the patch
> series?
[LCM] Yes, it is caused by [01/17], which includes <pthread_np.h> in freebsdapp.
> 
> Wouldn't it be better to rename the macros in enic instead of doing
> #undef?
[LCM] Agree, will do it.
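A sketch of what the rename could look like. The ENIC_ prefixed names are hypothetical, chosen only so that they cannot collide with PAGE_SIZE and ALIGN from the FreeBSD <sys/param.h>; __typeof__ is a GCC/clang extension, as in the original macro.

```c
#include <assert.h>

/* Prefixed names: no #undef needed, whatever the system headers define. */
#define ENIC_PAGE_SIZE           4096
#define ENIC_PAGE_ROUND_UP(x) \
	((((unsigned long)(x)) + ENIC_PAGE_SIZE - 1) & \
	 (~(ENIC_PAGE_SIZE - 1UL)))

#define ENIC_ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define ENIC_ALIGN(x, a)         ENIC_ALIGN_MASK(x, (__typeof__(x))(a) - 1)

/* Small self-check showing the expected rounding behaviour. */
static void enic_macro_selfcheck(void)
{
	assert(ENIC_PAGE_ROUND_UP(1) == ENIC_PAGE_SIZE);
	assert(ENIC_ALIGN(5u, 4) == 8u);
}
```

Compared with #undef, the rename leaves the system definitions intact for any other header that relies on them.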
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-09 14:08             ` Liang, Cunming
  2015-02-09 17:43               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 14:08 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > Add check for rte_socket_id(), avoid get unexpected return like (-1).
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_malloc/malloc_heap.h | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
> > index b4aec45..a47136d 100644
> > --- a/lib/librte_malloc/malloc_heap.h
> > +++ b/lib/librte_malloc/malloc_heap.h
> > @@ -44,7 +44,12 @@ extern "C" {
> >  static inline unsigned
> >  malloc_get_numa_socket(void)
> >  {
> > -	return rte_socket_id();
> > +	unsigned socket_id = rte_socket_id();
> > +
> > +	if (socket_id == (unsigned)SOCKET_ID_ANY)
> > +		return 0;
> > +
> > +	return socket_id;
> >  }
> >
> >  void *
> >
> 
> The documentation off rte_malloc_socket() says:
> 
> @param socket
>   NUMA socket to allocate memory on. If SOCKET_ID_ANY is used, this
>   function will behave the same as rte_malloc().
> 
> void *
> rte_malloc_socket(const char *type, size_t size, unsigned align, int
> socket);
> 
> 
> Your patch changes the behavior of rte_malloc() without explaining
> why, and the documentation becomes wrong.
> 
> Can you explain why you need this change?
[LCM] I don't think I changed the declaration of rte_malloc_socket().
If socket_arg is SOCKET_ID_ANY, the socket is expected to be the return value of malloc_get_numa_socket().
malloc_get_numa_socket() is supposed to return the correct TLS _socket_id.
That works fine in the normal case. But we changed the default value of the TLS _socket_id to SOCKET_ID_ANY,
and one lcore can now run on multiple cpus: if the cpus in the cpuset do not all belong to one NUMA node, _socket_id will be SOCKET_ID_ANY.
When the user calls rte_malloc_socket(SOCKET_ID_ANY), it still behaves the same as rte_malloc().
Both get the socket_id from malloc_get_numa_socket(). The added part only handles the exception path.
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread
  2015-02-08 20:01           ` Olivier MATZ
@ 2015-02-09 14:19             ` Liang, Cunming
  2015-02-09 17:44               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 14:19 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL
> thread
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > For those non-EAL thread, *_lcore_id* is invalid and probably larger than
> RTE_MAX_LCORE.
> > The patch adds the check and allows only EAL thread using EAL per thread log
> level and log type.
> > Others shares the global log level.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
> >  lib/librte_eal/common/include/rte_log.h |  5 +++++
> >  2 files changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_log.c
> b/lib/librte_eal/common/eal_common_log.c
> > index cf57619..e8dc94a 100644
> > --- a/lib/librte_eal/common/eal_common_log.c
> > +++ b/lib/librte_eal/common/eal_common_log.c
> > @@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
> >  		rte_logs.type &= (~type);
> >  }
> >
> > +/* Get global log type */
> > +uint32_t
> > +rte_get_log_type(void)
> > +{
> > +	return rte_logs.type;
> > +}
> > +
> >  /* get the current loglevel for the message beeing processed */
> >  int rte_log_cur_msg_loglevel(void)
> >  {
> >  	unsigned lcore_id;
> >  	lcore_id = rte_lcore_id();
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		return rte_get_log_level();
> >  	return log_cur_msg[lcore_id].loglevel;
> >  }
> >
> > @@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
> >  {
> >  	unsigned lcore_id;
> >  	lcore_id = rte_lcore_id();
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		return rte_get_log_type();
> >  	return log_cur_msg[lcore_id].logtype;
> >  }
> >
> > @@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
> >
> >  	/* save loglevel and logtype in a global per-lcore variable */
> >  	lcore_id = rte_lcore_id();
> > -	log_cur_msg[lcore_id].loglevel = level;
> > -	log_cur_msg[lcore_id].logtype = logtype;
> > +	if (lcore_id < RTE_MAX_LCORE) {
> > +		log_cur_msg[lcore_id].loglevel = level;
> > +		log_cur_msg[lcore_id].logtype = logtype;
> > +	}
> >
> >  	ret = vfprintf(f, format, ap);
> >  	fflush(f);
> > diff --git a/lib/librte_eal/common/include/rte_log.h
> b/lib/librte_eal/common/include/rte_log.h
> > index db1ea08..f83a0d9 100644
> > --- a/lib/librte_eal/common/include/rte_log.h
> > +++ b/lib/librte_eal/common/include/rte_log.h
> > @@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
> >  void rte_set_log_type(uint32_t type, int enable);
> >
> >  /**
> > + * Get the global log type.
> > + */
> > +uint32_t rte_get_log_type(void);
> > +
> > +/**
> >   * Get the current loglevel for the message being processed.
> >   *
> >   * Before calling the user-defined stream for logging, the log
> >
> 
> Wouldn't it be better to change the variable:
> static struct log_cur_msg log_cur_msg[RTE_MAX_LCORE];
> into a pthread (tls) variable?
> 
> With your patch, the log level and log type are not saved for
> non-EAL threads. If TLS were used, I think it would work in any case.
[LCM] Good point. But this patch set is not supposed to have a big impact on EAL threads.
Improvements for non-EAL threads will come in a separate patch set.
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-08 20:01           ` Olivier MATZ
@ 2015-02-09 14:24             ` Liang, Cunming
  2015-02-09 17:49               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 14:24 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1)
> by default
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > For those none EAL thread, *_lcore_id* shall always be LCORE_ID_ANY.
> > The libraries using *_lcore_id* as index need to take care.
> > *_socket_id* always be SOCKET_ID_ANY unitl the thread changes the affinity
> 
> unitl -> until
[LCM] Accepted.
> 
> > by rte_thread_set_affinity()
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_eal/bsdapp/eal/eal_thread.c   | 4 ++--
> >  lib/librte_eal/linuxapp/eal/eal_thread.c | 4 ++--
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c
> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> > index 5b16302..2b3c9a8 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> > @@ -56,8 +56,8 @@
> >  #include "eal_private.h"
> >  #include "eal_thread.h"
> >
> > -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> > -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> > +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> > +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
> >  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
> >
> >  /*
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c
> b/lib/librte_eal/linuxapp/eal/eal_thread.c
> > index 6eb1525..ab94e20 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
> > @@ -57,8 +57,8 @@
> >  #include "eal_private.h"
> >  #include "eal_thread.h"
> >
> > -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> > -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> > +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> > +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
> >  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
> 
> As far as I understand, now a rte_lcore_id() can return LCORE_ID_ANY.
> This should be modified in the rte_lcore_id() API comments.
> 
> Same for rte_socket_id().
[LCM] Accepted.
> 
> I also wonder if the API of these functions should be modified to
> return an int instead of an unsigned as LCORE_ID_ANY is -1.
[LCM] I prefer not to change the API definition. (unsigned)LCORE_ID_ANY is already in use.
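A small sketch of why the unsigned return type still works with LCORE_ID_ANY defined as -1. MODEL_MAX_LCORE is an assumed value standing in for RTE_MAX_LCORE, and lcore_id_is_valid() is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_MAX_LCORE 128   /* assumed; stands in for RTE_MAX_LCORE */
#define LCORE_ID_ANY    (-1)

/* -1 converted to unsigned is UINT_MAX, so it can never be a valid index. */
static_assert((unsigned)LCORE_ID_ANY == UINT_MAX, "cast wraps to UINT_MAX");

/* One range check covers both "out of range" and "not an EAL thread". */
static int lcore_id_is_valid(unsigned lcore_id)
{
	return lcore_id < MODEL_MAX_LCORE;
}
```

This is the same test the library fixes in this series use (lcore_id >= RTE_MAX_LCORE) before indexing per-lcore arrays.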
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread
  2015-02-08 20:01           ` Olivier MATZ
@ 2015-02-09 14:41             ` Liang, Cunming
  2015-02-09 17:52               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-09 14:41 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL
> thread
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > For non-EAL thread, bypass per lcore cache, directly use ring pool.
> > It allows using rte_mempool in either EAL thread or any user pthread.
> > As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
> > It doesn't suggest to run multi-pthread/cpu which compete the rte_mempool.
> > It will get bad performance and has critical risk if scheduling policy is RT.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h
> b/lib/librte_mempool/rte_mempool.h
> > index 3314651..4845f27 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -198,10 +198,12 @@ struct rte_mempool {
> >   *   Number to add to the object-oriented statistics.
> >   */
> >  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> > -		unsigned __lcore_id = rte_lcore_id();		\
> > -		mp->stats[__lcore_id].name##_objs += n;		\
> > -		mp->stats[__lcore_id].name##_bulk += 1;		\
> > +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> > +		unsigned __lcore_id = rte_lcore_id();           \
> > +		if (__lcore_id < RTE_MAX_LCORE) {               \
> > +			mp->stats[__lcore_id].name##_objs += n;	\
> > +			mp->stats[__lcore_id].name##_bulk += 1;	\
> > +		}                                               \
> 
> Does it mean that we have no statistics for non-EAL threads?
> (same question for rings and timers in the next patches)
[LCM] Yes, that is the case in this patch set. It mainly focuses on EAL threads and on making sure nothing breaks when running in a non-EAL thread.
Full non-EAL functionality will come in another patch set that enhances non-EAL threads as a second step.
> 
> 
> >  	} while(0)
> >  #else
> >  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> > @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void
> * const *obj_table,
> >  	__MEMPOOL_STAT_ADD(mp, put, n);
> >
> >  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > -	/* cache is not enabled or single producer */
> > -	if (unlikely(cache_size == 0 || is_mp == 0))
> > +	/* cache is not enabled or single producer or none EAL thread */
> > +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> > +		     lcore_id >= RTE_MAX_LCORE))
> >  		goto ring_enqueue;
> >
> >  	/* Go straight to ring if put would overflow mem allocated for cache */
> > @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void
> **obj_table,
> >  	uint32_t cache_size = mp->cache_size;
> >
> >  	/* cache is not enabled or single consumer */
> > -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> > +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> > +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> >  		goto ring_dequeue;
> >
> >  	cache = &mp->local_cache[lcore_id];
> >
> 
> What is the performance impact of adding this test?
[LCM] According to the perf unit test, it's almost the same. But I haven't measured the case where an EAL thread and a non-EAL thread share the same mempool.
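For reference, a single-threaded toy model of the bypass: a caller without a valid lcore id skips the per-lcore cache and goes straight to the shared store. struct pool_model and pool_put() are hypothetical, and the plain array used as the "ring" here is not thread-safe, unlike rte_ring.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_LCORE 4
#define CACHE_SIZE      8
#define RING_SIZE       64

struct pool_model {
	void *cache[MODEL_MAX_LCORE][CACHE_SIZE];   /* per-lcore caches */
	unsigned cache_len[MODEL_MAX_LCORE];
	void *ring[RING_SIZE];                      /* shared backing store */
	unsigned ring_len;
};

/* Put one object; returns 0 on success, -1 if the shared store is full. */
static int pool_put(struct pool_model *p, unsigned lcore_id, void *obj)
{
	assert(p != NULL);

	/* EAL thread with room in its cache: fast path, no shared state */
	if (lcore_id < MODEL_MAX_LCORE &&
	    p->cache_len[lcore_id] < CACHE_SIZE) {
		p->cache[lcore_id][p->cache_len[lcore_id]++] = obj;
		return 0;
	}

	/* non-EAL thread (or full cache): go straight to the shared store */
	if (p->ring_len == RING_SIZE)
		return -1;
	p->ring[p->ring_len++] = obj;
	return 0;
}
```

The extra comparison sits on the fast path of every put and get, which is why the performance question matters even for EAL threads.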
> 
> 
> Regards,
> Olivier


* Re: [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever
  2015-02-06 15:19           ` Olivier MATZ
@ 2015-02-09 15:43             ` Ananyev, Konstantin
  2015-02-10 16:53               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Ananyev, Konstantin @ 2015-02-09 15:43 UTC (permalink / raw)
  To: Olivier MATZ, Liang, Cunming, dev

Hi Olivier,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
> Sent: Friday, February 06, 2015 3:20 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > Add a sched_yield() syscall if the thread spins for too long, waiting other thread to finish its operations on the ring.
> > That gives pre-empted thread a chance to proceed and finish with ring enqnue/dequeue operation.
> > The purpose is to reduce contention on the ring.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  lib/librte_ring/rte_ring.h | 35 +++++++++++++++++++++++++++++------
> >  1 file changed, 29 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index 39bacdd..c402c73 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -126,6 +126,7 @@ struct rte_ring_debug_stats {
> >
> >  #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
> >  #define RTE_RING_MZ_PREFIX "RG_"
> > +#define RTE_RING_PAUSE_REP 0x100  /**< yield after num of times pause. */
> >
> >  /**
> >   * An RTE ring structure.
> > @@ -410,7 +411,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
> >  	uint32_t cons_tail, free_entries;
> >  	const unsigned max = n;
> >  	int success;
> > -	unsigned i;
> > +	unsigned i, rep;
> >  	uint32_t mask = r->prod.mask;
> >  	int ret;
> >
> > @@ -468,8 +469,19 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
> >  	 * If there are other enqueues in progress that preceded us,
> >  	 * we need to wait for them to complete
> >  	 */
> > -	while (unlikely(r->prod.tail != prod_head))
> > -		rte_pause();
> > +	do {
> > +		/* avoid spin too long waiting for other thread finish */
> > +		for (rep = RTE_RING_PAUSE_REP;
> > +		     rep != 0 && r->prod.tail != prod_head; rep--)
> > +			rte_pause();
> > +
> > +		/*
> > +		 * It gives pre-empted thread a chance to proceed and
> > +		 * finish with ring enqnue operation.
> > +		 */
> > +		if (rep == 0)
> > +			sched_yield();
> > +	} while (rep == 0);
> >
> >  	r->prod.tail = prod_next;
> >  	return ret;
> > @@ -589,7 +601,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
> >  	uint32_t cons_next, entries;
> >  	const unsigned max = n;
> >  	int success;
> > -	unsigned i;
> > +	unsigned i, rep;
> >  	uint32_t mask = r->prod.mask;
> >
> >  	/* move cons.head atomically */
> > @@ -634,8 +646,19 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
> >  	 * If there are other dequeues in progress that preceded us,
> >  	 * we need to wait for them to complete
> >  	 */
> > -	while (unlikely(r->cons.tail != cons_head))
> > -		rte_pause();
> > +	do {
> > +		/* avoid spin too long waiting for other thread finish */
> > +		for (rep = RTE_RING_PAUSE_REP;
> > +		     rep != 0 && r->cons.tail != cons_head; rep--)
> > +			rte_pause();
> > +
> > +		/*
> > +		 * It gives pre-empted thread a chance to proceed and
> > +		 * finish with ring denqnue operation.
> > +		 */
> > +		if (rep == 0)
> > +			sched_yield();
> > +	} while (rep == 0);
> >
> >  	__RING_STAT_ADD(r, deq_success, n);
> >  	r->cons.tail = cons_next;
> >
> 
> The ring library was designed with the assumption that the code is not
> preemptable. The code is lock-less but not wait-less. Actually, if the
> code is preempted at a bad moment, it can spin forever until it's
> unscheduled.
> 
> I wonder if adding a sched_yield() may not penalize the current
> implementations that only use one pthread per core? Even if there
> is only one pthread in the scheduler queue for this CPU, calling
> the scheduler code may cost thousands of cycles.
> 
> Also, where does this value "RTE_RING_PAUSE_REP 0x100" comes from?
> Why 0x100 is better than 42 or than 10000?

The idea was to have something a few times bigger than the actual number of
active cores in the system, to minimise the chance of sched_yield() being called
when we have one thread per physical core.
My thought was that having that many repeats would make such a chance negligible.
Though, right now, I don't have any data to back it up.

> I think it could be good to check if there is a performance impact
> with this change, especially where there is a lot of contention on
> the ring. If it has an impact, what about adding a compile or runtime
> option?

Good idea, probably we should make RTE_RING_PAUSE_REP a configuration option
and, let's say, avoid emitting 'sched_yield();' at all if RTE_RING_PAUSE_REP == 0.

Konstantin

> 
> 
> Regards,
> Olivier
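
A minimal sketch of the bounded-spin-then-yield wait discussed above, with the
repeat count as a compile-time option and the yield path compiled out when it
is 0. The names SKETCH_RING_PAUSE_REP, sketch_pause() and sketch_wait_for_tail()
are hypothetical stand-ins, not the ring library's actual symbols, and C11
atomics replace the plain loads used in the patch.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical tunable: 0 means "never call sched_yield()". */
#ifndef SKETCH_RING_PAUSE_REP
#define SKETCH_RING_PAUSE_REP 0x100
#endif

/* Stand-in for rte_pause(); only a compiler barrier in this sketch. */
static inline void
sketch_pause(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Spin until *tail == expected.
 * Returns how many times sched_yield() was called while waiting.
 */
static unsigned
sketch_wait_for_tail(_Atomic uint32_t *tail, uint32_t expected)
{
	unsigned yields = 0;

#if SKETCH_RING_PAUSE_REP == 0
	/* Original behaviour: pure busy-wait, no scheduler involvement. */
	while (atomic_load_explicit(tail, memory_order_acquire) != expected)
		sketch_pause();
#else
	unsigned rep;

	do {
		/* Bounded spin: at most SKETCH_RING_PAUSE_REP pauses. */
		for (rep = SKETCH_RING_PAUSE_REP;
		     rep != 0 &&
		     atomic_load_explicit(tail, memory_order_acquire) != expected;
		     rep--)
			sketch_pause();

		/* Budget exhausted: let a pre-empted peer run, then retry. */
		if (rep == 0) {
			sched_yield();
			yields++;
		}
	} while (rep == 0);
#endif
	return yields;
}
```

Building with -DSKETCH_RING_PAUSE_REP=0 would keep the one-pthread-per-core
fast path free of any scheduler call, which is the concern raised in the review.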

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
  2015-02-09 11:33             ` Liang, Cunming
@ 2015-02-09 17:06               ` Olivier MATZ
  2015-02-09 17:37                 ` Ananyev, Konstantin
  2015-02-10  0:45                 ` Liang, Cunming
  0 siblings, 2 replies; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:06 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 12:33 PM, Liang, Cunming wrote:
>> On 02/02/2015 03:02 AM, Cunming Liang wrote:
>>> The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
>>> as the lcore no longer always 1:1 pinning with physical cpu.
>>> The lcore now stands for a EAL thread rather than a logical cpu.
>>>
>>> It doesn't change the default behavior of 1:1 mapping, but allows to
>>> affinity the EAL thread to multiple cpus.
>>>
>>> [...]
>>> diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c
>> b/lib/librte_eal/bsdapp/eal/eal_memory.c
>>> index 65ee87d..a34d500 100644
>>> --- a/lib/librte_eal/bsdapp/eal/eal_memory.c
>>> +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
>>> @@ -45,6 +45,8 @@
>>>  #include "eal_internal_cfg.h"
>>>  #include "eal_filesystem.h"
>>>
>>> +/* avoid re-defined against with freebsd header */
>>> +#undef PAGE_SIZE
>>>  #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
>>
>> I don't see the link with the patch. Should this go somewhere else?

Maybe you missed this one.


>>> diff --git a/lib/librte_eal/common/include/rte_lcore.h
>> b/lib/librte_eal/common/include/rte_lcore.h
>>> index 49b2c03..4c7d6bb 100644
>>> --- a/lib/librte_eal/common/include/rte_lcore.h
>>> +++ b/lib/librte_eal/common/include/rte_lcore.h
>>> @@ -50,6 +50,13 @@ extern "C" {
>>>
>>>  #define LCORE_ID_ANY -1    /**< Any lcore. */
>>>
>>> +#if defined(__linux__)
>>> +	typedef	cpu_set_t rte_cpuset_t;
>>> +#elif defined(__FreeBSD__)
>>> +#include <pthread_np.h>
>>> +	typedef cpuset_t rte_cpuset_t;
>>> +#endif
>>> +
>>
>> Should we also define RTE_CPU_SETSIZE?
>> For linux, should <sched.h> be included?
> [LCM] It uses the fix size cpuset, won't use CPU_ALLOC() to get the pointer of cpuset.
> The RTE_CPU_SETSIZE always equal to sizeof(rte_cpuset_t).

The advantage of using CPU_ALLOC() is to avoid issues when the number
of cores becomes higher than 1024. I agree it's probably a bit early
to think about this, but it could happen soon :)


>> If I understand well, after the patch series, the user of
>> rte_thread_set_affinity() and rte_thread_get_affinity() are
>> supposed to use the macros from sched.h to access to this
>> cpuset parameter. So I'm wondering if it's not better to
>> use cpu_set_t from libc instead of redefining rte_cpuset_t.
>>
>> To reword my question: what is the purpose of redefining
>> cpu_set_t in rte_cpuset_t if we still need to use all the
>> libc API to access to it?
> [LCM] In linux the type is *cpu_set_t*, but in freebsd it's *cpuset_t*.
> The purpose of *rte_cpuset_t* is to make the consistent type definition in EAL, and to avoid lots of #ifdef for this diff.
> In either linux or freebsd, it still can use the MACRO in libc to set the rte_cpuset_t.

OK, it makes sense then. I did not notice the difference between linux
and bsd.
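
A minimal sketch of the point made above: one typedef hides the
cpu_set_t / cpuset_t naming difference, while the libc CPU_*() macros are still
used to manipulate the set. The type name sketch_cpuset_t and both helper
functions are hypothetical; the FreeBSD branch is an assumption and has not
been exercised here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* glibc: expose the CPU_*() macros */
#endif
#include <sched.h>

#if defined(__FreeBSD__)
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>
typedef cpuset_t sketch_cpuset_t;	/* FreeBSD name */
#else
typedef cpu_set_t sketch_cpuset_t;	/* Linux/glibc name */
#endif

/* Fill 'set' with the CPUs first..last (inclusive). Returns 0 or -1. */
static int
sketch_cpuset_from_range(sketch_cpuset_t *set, unsigned first, unsigned last)
{
	unsigned cpu;

	if (first > last || last >= CPU_SETSIZE)
		return -1;

	CPU_ZERO(set);
	for (cpu = first; cpu <= last; cpu++)
		CPU_SET(cpu, set);
	return 0;
}

/* Count the CPUs present in 'set' using only CPU_ISSET(). */
static unsigned
sketch_cpuset_count(const sketch_cpuset_t *set)
{
	unsigned cpu, n = 0;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, set))
			n++;
	return n;
}
```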

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc
  2015-02-09 11:57             ` Liang, Cunming
@ 2015-02-09 17:13               ` Olivier MATZ
  2015-02-10  0:54                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:13 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 12:57 PM, Liang, Cunming wrote:
>>> @@ -469,7 +469,7 @@ eal_parse_lcores(const char *lcores)
>>>  	/* Remove all blank characters ahead and after */
>>>  	while (isblank(*lcores))
>>>  		lcores++;
>>> -	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
>>> +	i = strnlen(lcores, PATH_MAX);
>>>  	while ((i > 0) && isblank(lcores[i - 1]))
>>>  		i--;
>>>
>>>
>>
>> I think PATH_MAX is not equivalent to _SC_ARG_MAX.
>>
>> But the main question is: why do we need to use strnlen() here instead
>> of strlen? We can expect that argv[] pointers are always nul-terminated.
>> Replacing them by strlen() would probably also solve the icc issue.
> [LCM] You're right, here strlen() also solve icc issue and no risk for argv[].
> But follows practice suggestion, keeping using those with 'n' function in DPDK is not bad.
> There's additional two reason to keep strnlen and PATH_MAX.
> 1. PATH_MAX is defined as 4096 which is enough as our input. It doesn't matter to be _SC_ARG_MAX or not.

PATH_MAX is 4096 but it's not related to the maximum argument length.

> 2. strnlen and PATH_MAX already used in eal_parse_coremask, to keep the style consistent in '-l' and '--lcores'.

I don't think it's a valid argument.

What is the problem with using strlen()? It looks like it solves all the
issues. Using strlen() on valid strings is not a security issue.


Regards,
Olivier
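
A minimal sketch of the blank-trimming step with plain strlen(), assuming the
input is a nul-terminated argv[] string as argued above. The helper name
sketch_trim_blanks() is hypothetical and is not the code in eal_parse_lcores().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Skip leading blanks and compute the length without trailing blanks.
 * Returns a pointer to the first non-blank character; the trimmed
 * length is stored in *len. The input string is not modified.
 */
static const char *
sketch_trim_blanks(const char *s, size_t *len)
{
	size_t i;

	/* Remove blank characters ahead of the value. */
	while (isblank((unsigned char)*s))
		s++;

	/* strlen() is safe here: the string is known to be terminated. */
	i = strlen(s);
	while (i > 0 && isblank((unsigned char)s[i - 1]))
		i--;

	*len = i;
	return s;
}
```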

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 04/17] eal: add support parsing socket_id from cpuset
  2015-02-09 12:26             ` Liang, Cunming
@ 2015-02-09 17:16               ` Olivier MATZ
  0 siblings, 0 replies; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:16 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 01:26 PM, Liang, Cunming wrote:
>>> @@ -50,4 +54,52 @@ __attribute__((noreturn)) void *eal_thread_loop(void
>> *arg);
>>>   */
>>>  void eal_thread_init_master(unsigned lcore_id);
>>>
>>> +/**
>>> + * Get the NUMA socket id from cpu id.
>>> + * This function is private to EAL.
>>> + *
>>> + * @param cpu_id
>>> + *   The logical process id.
>>> + * @return
>>> + *   socket_id or SOCKET_ID_ANY
>>> + */
>>> +unsigned eal_cpu_socket_id(unsigned cpu_id);
>>
>> Wouldn't it be better to rename the existing function cpu_socket_id()
>> in eal_cpu_socket_id() and export it in eal_thread.h?
>>
>> In case of bsd where cpu_socket_id() is implemented using a #define,
>> a new function should be created returning 0.
> [LCM] In eal_lcore.c, the cpu_socket_id()/cpu_core_id() defined as static and only used in rte_eal_cpu_init().
> I suppose the purpose of origin design is to make the sysfs parsing only visible in the file.
> No matter remove the 'static' prefix of cpu_core_id() or add a new wrap eal_cpu_socket_id(), it results in a new extern EAL API.
> So I prefer not change the visibility of the origin static function but have one as extern interface.

Yes, but I don't see what the advantage of using a wrapper is.
If there is no advantage, I think the option with less code is
better.



>>> +static inline int
>>> +eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
>>> +{
>>> +	unsigned cpu = 0;
>>> +	int socket_id = SOCKET_ID_ANY;
>>> +	int sid;
>>> +
>>> +	if (cpusetp == NULL)
>>> +		return SOCKET_ID_ANY;
>>> +
>>> +	do {
>>> +		if (!CPU_ISSET(cpu, cpusetp))
>>> +			continue;
>>> +
>>> +		if (socket_id == SOCKET_ID_ANY)
>>> +			socket_id = eal_cpu_socket_id(cpu);
>>> +
>>> +		sid = eal_cpu_socket_id(cpu);
>>> +		if (socket_id != sid) {
>>> +			socket_id = SOCKET_ID_ANY;
>>> +			break;
>>> +		}
>>> +
>>> +	} while (++cpu < RTE_MAX_LCORE);
>>> +
>>> +	return socket_id;
>>> +}
>>
>>
>> I don't think this function should be inlined.
>>
>> As this function is not used, it could be interesting for reviewers
>> to understand when
> [LCM] It's used in eal_thread_set_affinity() of eal_thread.c.

As it's not visible in the patch, could you add an explanation in
the commit log?
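
A minimal sketch of the rule implemented by the quoted eal_cpuset_socket_id():
a cpuset maps to a socket only if every CPU in it belongs to the same socket,
otherwise the result is "any". To stay self-contained it uses a 64-bit mask in
place of rte_cpuset_t and an invented topology; all sketch_* names are
hypothetical.

```c
#include <stdint.h>

#define SKETCH_SOCKET_ID_ANY (-1)
#define SKETCH_MAX_CPUS 64

/* Invented topology for the example: CPUs 0-7 on socket 0, rest on 1. */
static int
sketch_cpu_socket_id(unsigned cpu)
{
	return cpu < 8 ? 0 : 1;
}

/* Return the common socket of all CPUs in the mask, or "any". */
static int
sketch_cpuset_socket_id(uint64_t cpumask)
{
	int socket_id = SKETCH_SOCKET_ID_ANY;
	unsigned cpu;

	for (cpu = 0; cpu < SKETCH_MAX_CPUS; cpu++) {
		int sid;

		if (!(cpumask & (UINT64_C(1) << cpu)))
			continue;

		sid = sketch_cpu_socket_id(cpu);
		if (socket_id == SKETCH_SOCKET_ID_ANY)
			socket_id = sid;	/* first CPU seen */
		else if (socket_id != sid)
			return SKETCH_SOCKET_ID_ANY;	/* spans sockets */
	}
	return socket_id;
}
```

Unlike the quoted version, the lookup is done once per CPU rather than twice
for the first CPU found.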

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration
  2015-02-09 12:45             ` Liang, Cunming
@ 2015-02-09 17:26               ` Olivier MATZ
  2015-02-10  2:45                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:26 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 01:45 PM, Liang, Cunming wrote:
>>> +/**
>>> + * Dump the current pthread cpuset.
>>> + * This function is private to EAL.
>>> + *
>>> + * @param str
>>> + *   The string buffer the cpuset will dump to.
>>> + * @param size
>>> + *   The string buffer size.
>>> + */
>>> +#define CPU_STR_LEN            256
>>> +void
>>> +eal_thread_dump_affinity(char str[], unsigned size);
>>
>> Although it's equivalent for function arguments, I think "char *str" is
>> usually preferred over "char str[]". See for instance in snprintf() or
>> fgets().
> [LCM] Accept.
>>
>> What is the purpose of CPU_STR_LEN?
> [LCM] For default quick reference for str[] definition used in dump_affinity()

So the API comment of the function is not in the right
place.

A comment "Default buffer size to use with eal_thread_dump_affinity()"
should be added above CPU_STR_LEN. Also, it could be renamed to
RTE_CPU_STR_LEN or RTE_CPU_AFFINITY_STR_LEN.



>>> @@ -80,7 +81,9 @@ struct lcore_config {
>>>   */
>>>  extern struct lcore_config lcore_config[RTE_MAX_LCORE];
>>>
>>> -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
>>> +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id".
>> */
>>> +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id".
>> */
>>> +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset".
>> */
>>>
>>>  /**
>>>   * Return the ID of the execution unit we are running on.
>>> @@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
>>>  static inline unsigned
>>>  rte_socket_id(void)
>>>  {
>>> -	return lcore_config[rte_lcore_id()].socket_id;
>>> +	return RTE_PER_LCORE(_socket_id);
>>>  }
>>
>> I don't see where the _socket_id variable is assigned. I think there
>> is probably an issue with the splitting of the patches.
> [LCM] The value initializes as SOCKET_ID_ANY when RTE_DEFINE_PER_LCORE().
> And updated in eal_thread_set_affinity() for EAL thread and rte_thread_set_affinity() for non-EAL thread.

This is done in later patches:

"eal: set _lcore_id and _socket_id to (-1) by default"
"eal: apply affinity of EAL thread by assigned cpuset"

That's why I said there is probably an issue with the ordering
of the patches as these values are used here but initialized
later in the series.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API
  2015-02-09 13:12             ` Liang, Cunming
@ 2015-02-09 17:30               ` Olivier MATZ
  2015-02-10  2:46                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:30 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 02:12 PM, Liang, Cunming wrote:
>>> +int
>>> +rte_thread_get_affinity(rte_cpuset_t *cpusetp)
>>> +{
>>> +	if (!cpusetp)
>>> +		return -1;
>>
>> Same here. This is the only reason why rte_thread_get_affinity() could
>> fail. Removing this test would allow to change the API to return void
>> instead. It will avoid a useless test below in
>> eal_thread_dump_affinity().
> [LCM] The cpusetp is used as destination of memcpy and the function suppose an EAL API.
> I don't think it's a good idea to remove the check, do you ?

I know we often have debates on this subject on the list. My personal
opinion is that checking for a NULL pointer in these cases is useless
because the user is supposed to give a non-NULL pointer. Returning
an error will result in handling an error for something that cannot
happen.

On the other hand, adding an assert() (or the dpdk equivalent) would
be a good idea.


>>
>>> +
>>> +	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
>>> +		   sizeof(rte_cpuset_t));
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +void
>>> +eal_thread_dump_affinity(char str[], unsigned size)
>>> +{
>>> +	rte_cpuset_t cpuset;
>>> +	unsigned cpu;
>>> +	int ret;
>>> +	unsigned int out = 0;
>>> +
>>> +	if (rte_thread_get_affinity(&cpuset) < 0) {
>>> +		str[0] = '\0';
>>> +		return;
>>> +	}
>>
>> This one could be removed it the (== NULL) test is removed.
>>
>>> +
>>> +	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
>>> +		if (!CPU_ISSET(cpu, &cpuset))
>>> +			continue;
>>> +
>>> +		ret = snprintf(str + out,
>>> +			       size - out, "%u,", cpu);
>>> +		if (ret < 0 || (unsigned)ret >= size - out)
>>> +			break;
>>
>> On the contrary, I think here returning an error to the user
>> would be useful so he can knows that the dump is not complete.
> [LCM] accept.
>>
>>
>> Regards,
>> Olivier
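
A minimal sketch combining the two review points above: the programming error
(NULL buffer) is caught by an assert, while the runtime condition (buffer too
small) is reported to the caller. It uses a 64-bit mask in place of
rte_cpuset_t; sketch_dump_cpumask() is a hypothetical name, not the final
eal_thread_dump_affinity() signature.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write the CPUs of 'cpumask' into 'str' as "0,2,3".
 * Returns 0 on success, -1 if the list had to be truncated; in that
 * case 'str' holds the entries that fitted completely.
 */
static int
sketch_dump_cpumask(uint64_t cpumask, char *str, size_t size)
{
	size_t out = 0;
	unsigned cpu;
	int ret;

	/* A NULL or empty buffer is a caller bug, not a runtime error. */
	assert(str != NULL && size > 0);
	str[0] = '\0';

	for (cpu = 0; cpu < 64; cpu++) {
		if (!(cpumask & (UINT64_C(1) << cpu)))
			continue;

		ret = snprintf(str + out, size - out, "%s%u",
			       out ? "," : "", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			str[out] = '\0';	/* drop the partial entry */
			return -1;
		}
		out += (size_t)ret;
	}
	return 0;
}
```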

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset
  2015-02-09 13:48             ` Liang, Cunming
@ 2015-02-09 17:36               ` Olivier MATZ
  2015-02-10  2:51                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:36 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 02:48 PM, Liang, Cunming wrote:
>> -----Original Message-----
>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>> Sent: Monday, February 09, 2015 4:01 AM
>> To: Liang, Cunming; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by
>> assigned cpuset
>>
>> Hi,
>>
>> On 02/02/2015 03:02 AM, Cunming Liang wrote:
>>> EAL threads use assigned cpuset to set core affinity during startup.
>>> It keeps 1:1 mapping, if no '--lcores' option is used.
>>>
>>> [...]
>>>
>>>  lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
>>>  lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
>>>  lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
>>>  lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
>>>  4 files changed, 54 insertions(+), 96 deletions(-)
>>>
>>> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
>>> index 69f3c03..98c5a83 100644
>>> --- a/lib/librte_eal/bsdapp/eal/eal.c
>>> +++ b/lib/librte_eal/bsdapp/eal/eal.c
>>> @@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
>>>  	int i, fctret, ret;
>>>  	pthread_t thread_id;
>>>  	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
>>> +	char cpuset[CPU_STR_LEN];
>>>
>>>  	if (!rte_atomic32_test_and_set(&run_once))
>>>  		return -1;
>>> @@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
>>>  	if (rte_eal_pci_init() < 0)
>>>  		rte_panic("Cannot init PCI\n");
>>>
>>> -	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
>>> -		rte_config.master_lcore, thread_id);
>>> -
>>>  	eal_check_mem_on_local_socket();
>>>
>>>  	rte_eal_mcfg_complete();
>>>
>>> +	eal_thread_init_master(rte_config.master_lcore);
>>> +
>>> +	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
>>> +
>>> +	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s])\n",
>>> +		rte_config.master_lcore, thread_id, cpuset);
>>> +
>>>  	if (rte_eal_dev_init() < 0)
>>>  		rte_panic("Cannot init pmd devices\n");
>>>
>>> @@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
>>>  			rte_panic("Cannot create thread\n");
>>>  	}
>>>
>>> -	eal_thread_init_master(rte_config.master_lcore);
>>> -
>>>  	/*
>>>  	 * Launch a dummy function on all slave lcores, so that master lcore
>>>  	 * knows they are all ready when this function returns.
>>
>> I wonder if changing this may have an impact on third-party drivers
>> that already use a management thread. Before the patch, the init()
>> function of the external library was called with default affinities,
>> and now it's called with the affinity from master lcore.
>>
>> I think it should at least be noticed in the commit log.
>>
>> Why are you doing this change? (I don't say it's a bad change, but
>> I don't understand why you are doing it here)
> [LCM] To be honest, the main purpose is I don't found any reason to have linuxapp and freebsdapp in different init sequence.
> I means in linux it init_master before dev_init(), but in freebsd it reverse.


I agree that's something we should fix.


> And as the default value of TLS already changes, if dev_init() first and using those TLS, the result will be not in an EAL thread.
> But actually they're in the EAL master thread. So I prefer to do the change follows linuxapp sequence.

That makes sense. Is it possible to have this reordering in a separate
patch? The title could be
"eal: standardize init sequence between linux and bsd"



>>
>>
>>> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c
>> b/lib/librte_eal/bsdapp/eal/eal_thread.c
>>> index d0c077b..5b16302 100644
>>> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
>>> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
>>> @@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
>>>  {
>>>  	int s;
>>>  	pthread_t thread;
>>> -
>>> -/*
>>> - * According to the section VERSIONS of the CPU_ALLOC man page:
>>> - *
>>> - * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were
>> added
>>> - * in glibc 2.3.3.
>>> - *
>>> - * CPU_COUNT() first appeared in glibc 2.6.
>>> - *
>>> - * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),
>> CPU_ALLOC(),
>>> - * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),
>> CPU_CLR_S(),
>>> - * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and
>> CPU_EQUAL_S()
>>> - * first appeared in glibc 2.7.
>>> - */
>>> -#if defined(CPU_ALLOC)
>>> -	size_t size;
>>> -	cpu_set_t *cpusetp;
>>> -
>>> -	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
>>> -	if (cpusetp == NULL) {
>>> -		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
>>> -		return -1;
>>> -	}
>>> -
>>> -	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
>>> -	CPU_ZERO_S(size, cpusetp);
>>> -	CPU_SET_S(rte_lcore_id(), size, cpusetp);
>>> +	unsigned lcore_id = rte_lcore_id();
>>>
>>>  	thread = pthread_self();
>>> -	s = pthread_setaffinity_np(thread, size, cpusetp);
>>> +	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
>>> +				   &lcore_config[lcore_id].cpuset);
>>>  	if (s != 0) {
>>>  		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
>>> -		CPU_FREE(cpusetp);
>>>  		return -1;
>>>  	}
>>>
>>> -	CPU_FREE(cpusetp);
>>> -#else /* CPU_ALLOC */
>>> -	cpuset_t cpuset;
>>> -	CPU_ZERO( &cpuset );
>>> -	CPU_SET( rte_lcore_id(), &cpuset );
>>> +	/* acquire system unique id  */
>>> +	rte_gettid();
>>
>> As suggested in the previous patch, I think having rte_init_tid() would
>> be clearer here.
> [LCM] Sorry, I didn't get your [PATCH v4 07/17] comments, probably the mailbox issue.
> Do you suggest to have a rte_init_tid() but not do syscall on the first time ?
> Any benefit, rte_gettid() looks like more simple and straight forward. 

I think the mail was properly sent, you can see it here:
http://dpdk.org/ml/archives/dev/2015-February/012556.html

Usually, "get" functions return a value and have no side effects.
"init" functions return nothing (or an error code) but have a
side effect which is to initialize an internal state.
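
A minimal sketch of the naming convention described above, assuming Linux and
a thread-local cache for the system thread id. The functions sketch_init_tid()
and sketch_get_tid() are hypothetical and do not reflect the series' final API.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-thread cache of the system thread id; -1 means "not initialised". */
static __thread int sketch_cached_tid = -1;

/* "init": returns nothing, side effect is to fill the per-thread cache. */
static void
sketch_init_tid(void)
{
	sketch_cached_tid = (int)syscall(SYS_gettid);
}

/* "get": returns the cached value and has no side effects. */
static int
sketch_get_tid(void)
{
	return sketch_cached_tid;
}
```

With this split, the call in eal_thread_set_affinity() would read as an
explicit initialisation rather than as a discarded return value.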


>>> +
>>> +	/* store socket_id in TLS for quick access */
>>> +	RTE_PER_LCORE(_socket_id) =
>>> +		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
>>> +
>>> +	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
>>> +
>>> +	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
>>>
>>> -	thread = pthread_self();
>>> -	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
>>> -	if (s != 0) {
>>> -		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
>>> -		return -1;
>>> -	}
>>> -#endif
>>
>> You are removing a lot of code that was using CPU_ALLOC().
>> Are we sure that the cpuset_t type is large enough to store all the
>> CPUs?
>>
>> It looks the current value of CPU_SETSIZE is 1024 now, but I wonder
>> if this code was written when this value was lower. Could you check if
>> it can happen today (maybe with an old libc)? A problem can occur if
>> the size of cpuset_t is lower that the size of RTE_MAX_LCORE.
> [LCM] I found actually the MACRO is not just for support CPU_ALLOC(), but for linux or freebsd.
> In freebsdapp, there's no CPU_ALLOC defined, it use fixed width *cpuset_t*.
> In linuxapp, there's CPU_ALLOC defined, it use cpu_set_t* and dynamic CPU_ALLOC(RTE_MAX_LCORE).
> But actually RTE_MAX_LCORE < 1024(sizeof(cpu_set_t)). 
> After using rte_cpuset_t, there's no additional reason to use CPU_ALLOC only for linuxapp and choose a small but dynamic width.

I did a quick search on google, and it seems CPU_SETSIZE has been 1024
for a long time. So you are right, there is probably no reason to
keep CPU_ALLOC(). As I said in a previous mail, it could be useful
in the future when the number of CPUs reaches 1024, but we have
some time to handle this.
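
For reference, a minimal sketch of what the dynamic variant looks like on
Linux/glibc if more than CPU_SETSIZE CPUs ever have to be described. The
helper name sketch_alloc_single_cpu_set() is hypothetical; the CPU_ALLOC()
family is glibc-specific and, as noted in the thread, was not available on
FreeBSD.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* glibc: expose CPU_ALLOC() and friends */
#endif
#include <sched.h>
#include <stddef.h>

/*
 * Allocate a set able to hold 'ncpus' CPUs with only 'cpu' set.
 * The byte size needed by the *_S() macros is stored in *size.
 * Returns NULL on error; the caller releases the set with CPU_FREE().
 */
static cpu_set_t *
sketch_alloc_single_cpu_set(unsigned ncpus, unsigned cpu, size_t *size)
{
	cpu_set_t *set;

	if (cpu >= ncpus)
		return NULL;

	set = CPU_ALLOC(ncpus);
	if (set == NULL)
		return NULL;

	*size = CPU_ALLOC_SIZE(ncpus);
	CPU_ZERO_S(*size, set);
	CPU_SET_S(cpu, *size, set);
	return set;
}
```

The cost is visible in the signature: every user has to carry the size around
and free the set, which is the simplification the fixed-size type buys.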

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
  2015-02-09 17:06               ` Olivier MATZ
@ 2015-02-09 17:37                 ` Ananyev, Konstantin
  2015-02-10  0:45                 ` Liang, Cunming
  1 sibling, 0 replies; 253+ messages in thread
From: Ananyev, Konstantin @ 2015-02-09 17:37 UTC (permalink / raw)
  To: Olivier MATZ, Liang, Cunming, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
> Sent: Monday, February 09, 2015 5:07 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
> 
> Hi,
> 
> On 02/09/2015 12:33 PM, Liang, Cunming wrote:
> >> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> >>> The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
> >>> as the lcore no longer always 1:1 pinning with physical cpu.
> >>> The lcore now stands for a EAL thread rather than a logical cpu.
> >>>
> >>> It doesn't change the default behavior of 1:1 mapping, but allows to
> >>> affinity the EAL thread to multiple cpus.
> >>>
> >>> [...]
> >>> diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c
> >> b/lib/librte_eal/bsdapp/eal/eal_memory.c
> >>> index 65ee87d..a34d500 100644
> >>> --- a/lib/librte_eal/bsdapp/eal/eal_memory.c
> >>> +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
> >>> @@ -45,6 +45,8 @@
> >>>  #include "eal_internal_cfg.h"
> >>>  #include "eal_filesystem.h"
> >>>
> >>> +/* avoid re-defined against with freebsd header */
> >>> +#undef PAGE_SIZE
> >>>  #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
> >>
> >> I don't see the link with the patch. Should this go somewhere else?
> 
> Maybe you missed this one.
> 
> 
> >>> diff --git a/lib/librte_eal/common/include/rte_lcore.h
> >> b/lib/librte_eal/common/include/rte_lcore.h
> >>> index 49b2c03..4c7d6bb 100644
> >>> --- a/lib/librte_eal/common/include/rte_lcore.h
> >>> +++ b/lib/librte_eal/common/include/rte_lcore.h
> >>> @@ -50,6 +50,13 @@ extern "C" {
> >>>
> >>>  #define LCORE_ID_ANY -1    /**< Any lcore. */
> >>>
> >>> +#if defined(__linux__)
> >>> +	typedef	cpu_set_t rte_cpuset_t;
> >>> +#elif defined(__FreeBSD__)
> >>> +#include <pthread_np.h>
> >>> +	typedef cpuset_t rte_cpuset_t;
> >>> +#endif
> >>> +
> >>
> >> Should we also define RTE_CPU_SETSIZE?
> >> For linux, should <sched.h> be included?
> > [LCM] It uses the fix size cpuset, won't use CPU_ALLOC() to get the pointer of cpuset.
> > The RTE_CPU_SETSIZE always equal to sizeof(rte_cpuset_t).
> 
> The advantage of using CPU_ALLOC() is to avoid issues when the number
> of cores becomes higher than 1024. I agree it's probably a bit early
> to think about this, but it could happen soon :)

I personally don't think we'll hit the 1K cpu limit anytime soon...
On the other hand, a fixed size cpuset allows us to clean up and simplify the code quite a bit.
So, I'd suggest sticking with the fixed size for now.
Konstantin

> 
> 
> >> If I understand well, after the patch series, the user of
> >> rte_thread_set_affinity() and rte_thread_get_affinity() are
> >> supposed to use the macros from sched.h to access to this
> >> cpuset parameter. So I'm wondering if it's not better to
> >> use cpu_set_t from libc instead of redefining rte_cpuset_t.
> >>
> >> To reword my question: what is the purpose of redefining
> >> cpu_set_t in rte_cpuset_t if we still need to use all the
> >> libc API to access to it?
> > [LCM] In linux the type is *cpu_set_t*, but in freebsd it's *cpuset_t*.
> > The purpose of *rte_cpuset_t* is to make the consistent type definition in EAL, and to avoid lots of #ifdef for this diff.
> > In either linux or freebsd, it still can use the MACRO in libc to set the rte_cpuset_t.
> 
> OK, it makes sense then. I did not notice the difference between linux
> and bsd.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY
  2015-02-09 14:08             ` Liang, Cunming
@ 2015-02-09 17:43               ` Olivier MATZ
  0 siblings, 0 replies; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:43 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 03:08 PM, Liang, Cunming wrote:
> 
> 
>> -----Original Message-----
>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>> Sent: Monday, February 09, 2015 4:01 AM
>> To: Liang, Cunming; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v4 10/17] malloc: fix the issue of SOCKET_ID_ANY
>>
>> Hi,
>>
>> On 02/02/2015 03:02 AM, Cunming Liang wrote:
>>> Add check for rte_socket_id(), avoid get unexpected return like (-1).
>>>
>>> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
>>> ---
>>>  lib/librte_malloc/malloc_heap.h | 7 ++++++-
>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
>>> index b4aec45..a47136d 100644
>>> --- a/lib/librte_malloc/malloc_heap.h
>>> +++ b/lib/librte_malloc/malloc_heap.h
>>> @@ -44,7 +44,12 @@ extern "C" {
>>>  static inline unsigned
>>>  malloc_get_numa_socket(void)
>>>  {
>>> -	return rte_socket_id();
>>> +	unsigned socket_id = rte_socket_id();
>>> +
>>> +	if (socket_id == (unsigned)SOCKET_ID_ANY)
>>> +		return 0;
>>> +
>>> +	return socket_id;
>>>  }
>>>
>>>  void *
>>>
>>
>> The documentation of rte_malloc_socket() says:
>>
>> @param socket
>>   NUMA socket to allocate memory on. If SOCKET_ID_ANY is used, this
>>   function will behave the same as rte_malloc().
>>
>> void *
>> rte_malloc_socket(const char *type, size_t size, unsigned align, int
>> socket);
>>
>>
>> Your patch changes the behavior of rte_malloc() without explaining
>> why, and the documentation becomes wrong.
>>
>> Can you explain why you need this change?
> [LCM] I don't think I change the declaration of rte_malloc_socket().
> If socket_arg=SOCKET_ID_ANY, the socket value expect to the return value of malloc_get_numa_socket().
> The malloc_get_numa_socket() supposed to return the correct TLS _socket_id.
> It works fine for normal cases. But as we change the default value of TLS _socket_id to SOCKET_ID_ANY.
> And one lcore can run on multiple cpu, if all cpus in the cpuset are not belongs to one NUMA node, the _socket_id would be SOCKET_ID_ANY.
> When user call rte_malloc_socket(SOCKET_ID_ANY), it does provide the same behavior as rte_malloc().
> They both will get socket_id from malloc_get_numa_socket(). The addition part is the exception path process.

Sorry, I checked again, you are right.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread
  2015-02-09 14:19             ` Liang, Cunming
@ 2015-02-09 17:44               ` Olivier MATZ
  2015-02-10  2:56                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:44 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 03:19 PM, Liang, Cunming wrote:
>>> --- a/lib/librte_eal/common/include/rte_log.h
>>> +++ b/lib/librte_eal/common/include/rte_log.h
>>> @@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
>>>  void rte_set_log_type(uint32_t type, int enable);
>>>
>>>  /**
>>> + * Get the global log type.
>>> + */
>>> +uint32_t rte_get_log_type(void);
>>> +
>>> +/**
>>>   * Get the current loglevel for the message being processed.
>>>   *
>>>   * Before calling the user-defined stream for logging, the log
>>>
>>
>> Wouldn't it be better to change the variable:
>> static struct log_cur_msg log_cur_msg[RTE_MAX_LCORE];
>> into a pthread (tls) variable?
>>
>> With your patch, the log level and log type are not saved for
>> non-EAL threads. If TLS were used, I think it would work in any case.
> [LCM] Good point. But for this patch set, still suppose not involve big impact to EAL thread.
> For improve non-EAL thread, we'll have a separate patch set for it.

OK, that's fine

Will it be for 2.0 or later?
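
A minimal sketch of the TLS alternative suggested above, assuming the GCC/Clang
__thread storage class. The struct layout and all sketch_* names are
hypothetical; they only illustrate replacing an array indexed by lcore id with
a per-thread variable, so that non-EAL threads get their own slot too.

```c
#include <stdint.h>

/* Level and type of the message currently being logged. */
struct sketch_log_cur_msg {
	uint32_t loglevel;
	uint32_t logtype;
};

/* One instance per thread, EAL or not; no lcore id needed as index. */
static __thread struct sketch_log_cur_msg sketch_cur_msg;

static void
sketch_log_set_cur(uint32_t level, uint32_t type)
{
	sketch_cur_msg.loglevel = level;
	sketch_cur_msg.logtype = type;
}

static uint32_t
sketch_log_cur_level(void)
{
	return sketch_cur_msg.loglevel;
}

static uint32_t
sketch_log_cur_type(void)
{
	return sketch_cur_msg.logtype;
}
```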

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-09 14:24             ` Liang, Cunming
@ 2015-02-09 17:49               ` Olivier MATZ
  2015-02-10  2:53                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:49 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 03:24 PM, Liang, Cunming wrote:
>>> --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
>>> +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
>>> @@ -57,8 +57,8 @@
>>>  #include "eal_private.h"
>>>  #include "eal_thread.h"
>>>
>>> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
>>> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
>>> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
>>> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
>>>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
>>
>> As far as I understand, now a rte_lcore_id() can return LCORE_ID_ANY.
>> This should be modified in the rte_lcore_id() API comments.
>>
>> Same for rte_socket_id().
> [LCM] accept.
>>
>> I also wonder if the API of these functions should be modified to
>> return an int instead of an unsigned as LCORE_ID_ANY is -1.
> [LCM] I prefer not change the API definition. (unsigned)LCORE_ID_ANY already used before.

OK

And what about directly defining the following?

#define LCORE_ID_ANY ((unsigned)-1)


It would avoid the casts.
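
A minimal sketch of that suggestion, assuming the sentinel is defined as an
unsigned value. The sketch_* names and the value 128 are hypothetical; the
point is only that comparisons and the TLS initialiser need no cast, and that
a plain "< max" test already rejects the sentinel.

```c
#define SKETCH_MAX_LCORE 128
#define SKETCH_LCORE_ID_ANY ((unsigned)-1)	/* unsigned sentinel */

/* Per-thread lcore id; threads not started by EAL keep the sentinel. */
static __thread unsigned sketch_lcore_id = SKETCH_LCORE_ID_ANY;

static unsigned
sketch_get_lcore_id(void)
{
	return sketch_lcore_id;
}

/* The sentinel is UINT_MAX, so it can never pass the range check. */
static int
sketch_is_eal_thread(void)
{
	return sketch_lcore_id < SKETCH_MAX_LCORE;
}
```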

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread
  2015-02-09 14:41             ` Liang, Cunming
@ 2015-02-09 17:52               ` Olivier MATZ
  2015-02-10  2:57                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-09 17:52 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/09/2015 03:41 PM, Liang, Cunming wrote:
>>>  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
>>> -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
>>> -		unsigned __lcore_id = rte_lcore_id();		\
>>> -		mp->stats[__lcore_id].name##_objs += n;		\
>>> -		mp->stats[__lcore_id].name##_bulk += 1;		\
>>> +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
>>> +		unsigned __lcore_id = rte_lcore_id();           \
>>> +		if (__lcore_id < RTE_MAX_LCORE) {               \
>>> +			mp->stats[__lcore_id].name##_objs += n;	\
>>> +			mp->stats[__lcore_id].name##_bulk += 1;	\
>>> +		}                                               \
>>
>> Does it mean that we have no statistics for non-EAL threads?
>> (same question for rings and timers in the next patches)
> [LCM] Yes, it is in this patch set, mainly focus on EAL thread and make sure no running issue on non-EAL thread.
> For full non-EAL function, will have other patch set to enhance non-EAL thread as the 2nd step.

OK

>>> @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void
>> **obj_table,
>>>  	uint32_t cache_size = mp->cache_size;
>>>
>>>  	/* cache is not enabled or single consumer */
>>> -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
>>> +	if (unlikely(cache_size == 0 || is_mc == 0 ||
>>> +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
>>>  		goto ring_dequeue;
>>>
>>>  	cache = &mp->local_cache[lcore_id];
>>>
>>
>> What is the performance impact of adding this test?
> [LCM] In the unit-test perf run, it's almost the same. But I haven't measured the case where an EAL thread and a non-EAL thread share the same mempool.


When you say "unit test", are you talking about mempool tests from
"make test"? Do you have some numbers to share?

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread lcore_config
  2015-02-09 17:06               ` Olivier MATZ
  2015-02-09 17:37                 ` Ananyev, Konstantin
@ 2015-02-10  0:45                 ` Liang, Cunming
  1 sibling, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  0:45 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:07 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 01/17] eal: add cpuset into per EAL thread
> lcore_config
> 
> Hi,
> 
> On 02/09/2015 12:33 PM, Liang, Cunming wrote:
> >> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> >>> The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
> >>> as the lcore no longer always 1:1 pinning with physical cpu.
> >>> The lcore now stands for a EAL thread rather than a logical cpu.
> >>>
> >>> It doesn't change the default behavior of 1:1 mapping, but allows to
> >>> affinity the EAL thread to multiple cpus.
> >>>
> >>> [...]
> >>> diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c
> >> b/lib/librte_eal/bsdapp/eal/eal_memory.c
> >>> index 65ee87d..a34d500 100644
> >>> --- a/lib/librte_eal/bsdapp/eal/eal_memory.c
> >>> +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
> >>> @@ -45,6 +45,8 @@
> >>>  #include "eal_internal_cfg.h"
> >>>  #include "eal_filesystem.h"
> >>>
> >>> +/* avoid re-defined against with freebsd header */
> >>> +#undef PAGE_SIZE
> >>>  #define PAGE_SIZE (sysconf(_SC_PAGESIZE))
> >>
> >> I don't see the link with the patch. Should this go somewhere else?
> 
> Maybe you missed this one.
[LCM] Yes, I missed this one. I agree to move it to a separate patch, dropping the #undef and renaming PAGE_SIZE to EAL_PAGE_SIZE instead.
> 
> 
> >>> diff --git a/lib/librte_eal/common/include/rte_lcore.h
> >> b/lib/librte_eal/common/include/rte_lcore.h
> >>> index 49b2c03..4c7d6bb 100644
> >>> --- a/lib/librte_eal/common/include/rte_lcore.h
> >>> +++ b/lib/librte_eal/common/include/rte_lcore.h
> >>> @@ -50,6 +50,13 @@ extern "C" {
> >>>
> >>>  #define LCORE_ID_ANY -1    /**< Any lcore. */
> >>>
> >>> +#if defined(__linux__)
> >>> +	typedef	cpu_set_t rte_cpuset_t;
> >>> +#elif defined(__FreeBSD__)
> >>> +#include <pthread_np.h>
> >>> +	typedef cpuset_t rte_cpuset_t;
> >>> +#endif
> >>> +
> >>
> >> Should we also define RTE_CPU_SETSIZE?
> >> For linux, should <sched.h> be included?
> > [LCM] It uses a fixed-size cpuset and won't use CPU_ALLOC() to get a pointer to the
> cpuset.
> > RTE_CPU_SETSIZE would always be equal to sizeof(rte_cpuset_t).
> 
> The advantage of using CPU_ALLOC() is to avoid issues when the number
> of cores exceeds 1024. I agree it's probably a bit early
> to think about this, but it could happen soon :)
> 
> 
> >> If I understand well, after the patch series, the user of
> >> rte_thread_set_affinity() and rte_thread_get_affinity() are
> >> supposed to use the macros from sched.h to access to this
> >> cpuset parameter. So I'm wondering if it's not better to
> >> use cpu_set_t from libc instead of redefining rte_cpuset_t.
> >>
> >> To reword my question: what is the purpose of redefining
> >> cpu_set_t in rte_cpuset_t if we still need to use all the
> >> libc API to access to it?
> > [LCM] In linux the type is *cpu_set_t*, but in freebsd it's *cpuset_t*.
> > The purpose of *rte_cpuset_t* is to make the consistent type definition in EAL,
> and to avoid lots of #ifdef for this diff.
> > In either linux or freebsd, it still can use the MACRO in libc to set the
> rte_cpuset_t.
> 
> OK, it makes sense then. I did not notice the difference between linux
> and bsd.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in 32bit icc
  2015-02-09 17:13               ` Olivier MATZ
@ 2015-02-10  0:54                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  0:54 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:13 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 03/17] eal: fix wrong strnlen() return value in
> 32bit icc
> 
> Hi,
> 
> On 02/09/2015 12:57 PM, Liang, Cunming wrote:
> >>> @@ -469,7 +469,7 @@ eal_parse_lcores(const char *lcores)
> >>>  	/* Remove all blank characters ahead and after */
> >>>  	while (isblank(*lcores))
> >>>  		lcores++;
> >>> -	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
> >>> +	i = strnlen(lcores, PATH_MAX);
> >>>  	while ((i > 0) && isblank(lcores[i - 1]))
> >>>  		i--;
> >>>
> >>>
> >>
> >> I think PATH_MAX is not equivalent to _SC_ARG_MAX.
> >>
> >> But the main question is: why do we need to use strnlen() here instead
> >> of strlen? We can expect that argv[] pointers are always nul-terminated.
> >> Replacing them by strlen() would probably also solve the icc issue.
> > [LCM] You're right, strlen() also solves the icc issue here and carries no risk for argv[].
> > But following common practice, continuing to use the 'n' variants in DPDK is
> not bad.
> > There are two additional reasons to keep strnlen and PATH_MAX.
> > 1. PATH_MAX is defined as 4096, which is enough for our input. It doesn't matter
> whether it is _SC_ARG_MAX or not.
> 
> PATH_MAX is 4096 but it's not related to the maximum argument length.
> 
> > 2. strnlen and PATH_MAX are already used in eal_parse_coremask, so this keeps the
> style consistent between '-l' and '--lcores'.
> 
> I don't think it's a valid argument.
> 
> What is the problem with using strlen()? It looks like it solves all the
> issues. Using strlen on valid strings is not a security issue.
[LCM] All right, I'm convinced by your point.
> 
> 
> Regards,
> Olivier
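
As a rough sketch of the strlen()-based trimming agreed on above, assuming a
nul-terminated input such as an argv[] entry. my_trim_blanks() is a
hypothetical helper written for this example, not the code in
eal_parse_lcores():

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: skips leading blanks, then measures the string with
 * plain strlen() and drops trailing blanks. Returns a pointer to the first
 * non-blank character and stores the trimmed length in *len.
 */
static const char *my_trim_blanks(const char *s, size_t *len)
{
	size_t i;

	assert(s != NULL && len != NULL);

	/* Remove blank characters ahead of the value. */
	while (isblank((unsigned char)*s))
		s++;

	/* argv[] strings are nul-terminated, so strlen() is sufficient. */
	i = strlen(s);

	/* Remove blank characters after the value. */
	while (i > 0 && isblank((unsigned char)s[i - 1]))
		i--;

	*len = i;
	return s;
}
```

No upper bound such as PATH_MAX or _SC_ARG_MAX is needed in this form.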

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API declaration
  2015-02-09 17:26               ` Olivier MATZ
@ 2015-02-10  2:45                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  2:45 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:26 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 05/17] eal: new TLS definition and API
> declaration
> 
> Hi,
> 
> On 02/09/2015 01:45 PM, Liang, Cunming wrote:
> >>> +/**
> >>> + * Dump the current pthread cpuset.
> >>> + * This function is private to EAL.
> >>> + *
> >>> + * @param str
> >>> + *   The string buffer the cpuset will dump to.
> >>> + * @param size
> >>> + *   The string buffer size.
> >>> + */
> >>> +#define CPU_STR_LEN            256
> >>> +void
> >>> +eal_thread_dump_affinity(char str[], unsigned size);
> >>
> >> Although it's equivalent for function arguments, I think "char *str" is
> >> usually preferred over "char str[]". See for instance in snprintf() or
> >> fgets().
> > [LCM] Accept.
> >>
> >> What is the purpose of CPU_STR_LEN?
> > [LCM] For default quick reference for str[] definition used in dump_affinity()
> 
> So the API comment of the function is not placed at the right
> place.
> 
> A comment "Default buffer size to use with eal_thread_dump_affinity()"
> should be added above CPU_STR_LEN. Also, it could be renamed to
> RTE_CPU_STR_LEN or RTE_CPU_AFFINITY_STR_LEN.
[LCM] Got you.
> 
> 
> 
> >>> @@ -80,7 +81,9 @@ struct lcore_config {
> >>>   */
> >>>  extern struct lcore_config lcore_config[RTE_MAX_LCORE];
> >>>
> >>> -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
> >>> +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id".
> >> */
> >>> +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket
> id".
> >> */
> >>> +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread
> "cpuset".
> >> */
> >>>
> >>>  /**
> >>>   * Return the ID of the execution unit we are running on.
> >>> @@ -146,7 +149,7 @@ rte_lcore_index(int lcore_id)
> >>>  static inline unsigned
> >>>  rte_socket_id(void)
> >>>  {
> >>> -	return lcore_config[rte_lcore_id()].socket_id;
> >>> +	return RTE_PER_LCORE(_socket_id);
> >>>  }
> >>
> >> I don't see where the _socket_id variable is assigned. I think there
> >> is probably an issue with the splitting of the patches.
> > [LCM] The value initializes as SOCKET_ID_ANY when RTE_DEFINE_PER_LCORE().
> > And updated in eal_thread_set_affinity() for EAL thread and
> rte_thread_set_affinity() for non-EAL thread.
> 
> This is done in later patches:
> 
> "eal: set _lcore_id and _socket_id to (-1) by default"
> "eal: apply affinity of EAL thread by assigned cpuset"
> 
> That's why I said there is probably an issue with the ordering
> of the patches as these values are used here but initialized
> later in the series.
[LCM] Will reorder them in next version.
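
A minimal sketch of why the ordering matters: a per-thread variable read by
an accessor needs its default defined together with the accessor, otherwise
a thread that never set its affinity reads an unintended value. All my_*
names below are hypothetical, and C11 _Thread_local stands in for the
RTE_DEFINE_PER_LCORE() macro:

```c
#include <assert.h>

#define MY_SOCKET_ID_ANY ((unsigned)-1)  /* assumed "no socket" sentinel */
#define MY_MAX_SOCKETS   8               /* assumed upper bound for the example */

/* Per-thread socket id. The default is what a non-EAL thread sees. */
static _Thread_local unsigned my_socket_id = MY_SOCKET_ID_ANY;

/* Accessor in the style of rte_socket_id(): only reads the TLS value. */
static unsigned my_get_socket_id(void)
{
	return my_socket_id;
}

/* Called once the thread's affinity is known. */
static void my_thread_set_socket(unsigned socket_id)
{
	assert(socket_id < MY_MAX_SOCKETS || socket_id == MY_SOCKET_ID_ANY);
	my_socket_id = socket_id;
}
```

If the accessor lands in an earlier patch than the default and the setter,
the intermediate commits return 0 for every thread that has not been set up.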

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for common thread API
  2015-02-09 17:30               ` Olivier MATZ
@ 2015-02-10  2:46                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  2:46 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:30 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 06/17] eal: add eal_common_thread.c for
> common thread API
> 
> Hi,
> 
> On 02/09/2015 02:12 PM, Liang, Cunming wrote:
> >>> +int
> >>> +rte_thread_get_affinity(rte_cpuset_t *cpusetp)
> >>> +{
> >>> +	if (!cpusetp)
> >>> +		return -1;
> >>
> >> Same here. This is the only reason why rte_thread_get_affinity() could
> >> fail. Removing this test would allow to change the API to return void
> >> instead. It will avoid a useless test below in
> >> eal_thread_dump_affinity().
> > [LCM] cpusetp is used as the destination of a memcpy and the function is supposed to be
> an EAL API.
> > I don't think it's a good idea to remove the check, do you?
> 
> I know we often debate this subject on the list. My personal
> opinion is that checking a NULL pointer in these cases is useless
> because the user is supposed to give a non-NULL pointer. Returning
> an error will result in managing an error for something that cannot
> happen.
> 
> On the other hand, adding an assert() (or the dpdk equivalent) would
> be a good idea.
[LCM] Ok, I see. Will update it.
> 
> 
> >>
> >>> +
> >>> +	rte_memcpy(cpusetp, &RTE_PER_LCORE(_cpuset),
> >>> +		   sizeof(rte_cpuset_t));
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +void
> >>> +eal_thread_dump_affinity(char str[], unsigned size)
> >>> +{
> >>> +	rte_cpuset_t cpuset;
> >>> +	unsigned cpu;
> >>> +	int ret;
> >>> +	unsigned int out = 0;
> >>> +
> >>> +	if (rte_thread_get_affinity(&cpuset) < 0) {
> >>> +		str[0] = '\0';
> >>> +		return;
> >>> +	}
> >>
> >> This one could be removed if the (== NULL) test is removed.
> >>
> >>> +
> >>> +	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
> >>> +		if (!CPU_ISSET(cpu, &cpuset))
> >>> +			continue;
> >>> +
> >>> +		ret = snprintf(str + out,
> >>> +			       size - out, "%u,", cpu);
> >>> +		if (ret < 0 || (unsigned)ret >= size - out)
> >>> +			break;
> >>
> >> On the contrary, I think returning an error to the user here
> >> would be useful, so they know that the dump is not complete.
> > [LCM] accept.
> >>
> >>
> >> Regards,
> >> Olivier
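
A sketch of the two points accepted above: an assert() in place of the NULL
check, and an error return when the dump is truncated. To stay portable it
uses a 64-bit mask instead of rte_cpuset_t, and my_dump_mask() is a
hypothetical name, so treat it as an illustration rather than the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "cpu," for every bit set in mask.
 * Returns 0 when the full list fits, -1 when the output was truncated.
 * On truncation the buffer keeps only the entries that fit completely.
 */
static int my_dump_mask(uint64_t mask, char *str, size_t size)
{
	size_t out = 0;
	unsigned cpu;

	assert(str != NULL);  /* programming error, not a runtime condition */
	if (size == 0)
		return -1;
	str[0] = '\0';

	for (cpu = 0; cpu < 64; cpu++) {
		int ret;

		if (!(mask & (UINT64_C(1) << cpu)))
			continue;

		ret = snprintf(str + out, size - out, "%u,", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			str[out] = '\0';  /* drop the partial entry */
			return -1;        /* tell the caller the dump is incomplete */
		}
		out += (size_t)ret;
	}
	return 0;
}
```

The caller can then decide whether to log the partial list or retry with a
larger buffer.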

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by assigned cpuset
  2015-02-09 17:36               ` Olivier MATZ
@ 2015-02-10  2:51                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  2:51 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:37 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by
> assigned cpuset
> 
> Hi,
> 
> On 02/09/2015 02:48 PM, Liang, Cunming wrote:
> >> -----Original Message-----
> >> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >> Sent: Monday, February 09, 2015 4:01 AM
> >> To: Liang, Cunming; dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v4 08/17] eal: apply affinity of EAL thread by
> >> assigned cpuset
> >>
> >> Hi,
> >>
> >> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> >>> EAL threads use assigned cpuset to set core affinity during startup.
> >>> It keeps 1:1 mapping, if no '--lcores' option is used.
> >>>
> >>> [...]
> >>>
> >>>  lib/librte_eal/bsdapp/eal/eal.c          | 13 ++++---
> >>>  lib/librte_eal/bsdapp/eal/eal_thread.c   | 63 +++++++++---------------------
> >>>  lib/librte_eal/linuxapp/eal/eal.c        |  7 +++-
> >>>  lib/librte_eal/linuxapp/eal/eal_thread.c | 67 +++++++++++---------------------
> >>>  4 files changed, 54 insertions(+), 96 deletions(-)
> >>>
> >>> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> >>> index 69f3c03..98c5a83 100644
> >>> --- a/lib/librte_eal/bsdapp/eal/eal.c
> >>> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> >>> @@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
> >>>  	int i, fctret, ret;
> >>>  	pthread_t thread_id;
> >>>  	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
> >>> +	char cpuset[CPU_STR_LEN];
> >>>
> >>>  	if (!rte_atomic32_test_and_set(&run_once))
> >>>  		return -1;
> >>> @@ -502,13 +503,17 @@ rte_eal_init(int argc, char **argv)
> >>>  	if (rte_eal_pci_init() < 0)
> >>>  		rte_panic("Cannot init PCI\n");
> >>>
> >>> -	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
> >>> -		rte_config.master_lcore, thread_id);
> >>> -
> >>>  	eal_check_mem_on_local_socket();
> >>>
> >>>  	rte_eal_mcfg_complete();
> >>>
> >>> +	eal_thread_init_master(rte_config.master_lcore);
> >>> +
> >>> +	eal_thread_dump_affinity(cpuset, CPU_STR_LEN);
> >>> +
> >>> +	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready
> (tid=%p;cpuset=[%s])\n",
> >>> +		rte_config.master_lcore, thread_id, cpuset);
> >>> +
> >>>  	if (rte_eal_dev_init() < 0)
> >>>  		rte_panic("Cannot init pmd devices\n");
> >>>
> >>> @@ -532,8 +537,6 @@ rte_eal_init(int argc, char **argv)
> >>>  			rte_panic("Cannot create thread\n");
> >>>  	}
> >>>
> >>> -	eal_thread_init_master(rte_config.master_lcore);
> >>> -
> >>>  	/*
> >>>  	 * Launch a dummy function on all slave lcores, so that master lcore
> >>>  	 * knows they are all ready when this function returns.
> >>
> >> I wonder if changing this may have an impact on third-party drivers
> >> that already use a management thread. Before the patch, the init()
> >> function of the external library was called with default affinities,
> >> and now it's called with the affinity from master lcore.
> >>
> >> I think it should at least be noticed in the commit log.
> >>
> >> Why are you doing this change? (I don't say it's a bad change, but
> >> I don't understand why you are doing it here)
> > [LCM] To be honest, the main reason is that I didn't find any reason for
> linuxapp and freebsdapp to have different init sequences.
> > I mean that on linux init_master runs before dev_init(), but on freebsd the order is reversed.
> 
> 
> I agree that's something we should fix.
> 
> 
> > And as the default TLS values have already changed, if dev_init() runs first and reads
> those TLS values, it will appear not to be in an EAL thread.
> > But it is actually running in the EAL master thread. So I prefer to change this
> to follow the linuxapp sequence.
> 
> That makes sense. Is it possible to have this reordering in a separate
> patch? The title could be
> "eal: standardize init sequence between linux and bsd"
[LCM] Agree.
> 
> 
> 
> >>
> >>
> >>> diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c
> >> b/lib/librte_eal/bsdapp/eal/eal_thread.c
> >>> index d0c077b..5b16302 100644
> >>> --- a/lib/librte_eal/bsdapp/eal/eal_thread.c
> >>> +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
> >>> @@ -103,55 +103,27 @@ eal_thread_set_affinity(void)
> >>>  {
> >>>  	int s;
> >>>  	pthread_t thread;
> >>> -
> >>> -/*
> >>> - * According to the section VERSIONS of the CPU_ALLOC man page:
> >>> - *
> >>> - * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were
> >> added
> >>> - * in glibc 2.3.3.
> >>> - *
> >>> - * CPU_COUNT() first appeared in glibc 2.6.
> >>> - *
> >>> - * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),
> >> CPU_ALLOC(),
> >>> - * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),
> >> CPU_CLR_S(),
> >>> - * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and
> >> CPU_EQUAL_S()
> >>> - * first appeared in glibc 2.7.
> >>> - */
> >>> -#if defined(CPU_ALLOC)
> >>> -	size_t size;
> >>> -	cpu_set_t *cpusetp;
> >>> -
> >>> -	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
> >>> -	if (cpusetp == NULL) {
> >>> -		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
> >>> -		return -1;
> >>> -	}
> >>> -
> >>> -	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
> >>> -	CPU_ZERO_S(size, cpusetp);
> >>> -	CPU_SET_S(rte_lcore_id(), size, cpusetp);
> >>> +	unsigned lcore_id = rte_lcore_id();
> >>>
> >>>  	thread = pthread_self();
> >>> -	s = pthread_setaffinity_np(thread, size, cpusetp);
> >>> +	s = pthread_setaffinity_np(thread, sizeof(cpuset_t),
> >>> +				   &lcore_config[lcore_id].cpuset);
> >>>  	if (s != 0) {
> >>>  		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> >>> -		CPU_FREE(cpusetp);
> >>>  		return -1;
> >>>  	}
> >>>
> >>> -	CPU_FREE(cpusetp);
> >>> -#else /* CPU_ALLOC */
> >>> -	cpuset_t cpuset;
> >>> -	CPU_ZERO( &cpuset );
> >>> -	CPU_SET( rte_lcore_id(), &cpuset );
> >>> +	/* acquire system unique id  */
> >>> +	rte_gettid();
> >>
> >> As suggested in the previous patch, I think having rte_init_tid() would
> >> be clearer here.
> > [LCM] Sorry, I didn't receive your [PATCH v4 07/17] comments, probably a
> mailbox issue.
> > Do you suggest having a rte_init_tid() rather than doing the syscall on the first call?
> > What is the benefit? rte_gettid() looks simpler and more straightforward.
> 
> I think the mail was properly sent, you can see it here:
> http://dpdk.org/ml/archives/dev/2015-February/012556.html
> 
> Usually, "get" functions return a value and have no side effects.
> "init" functions return nothing (or an error code) but have a
> side effect which is to initialize an internal state.
[LCM] Thanks, I agree with your point; will update in v5.
> 
> 
> >>> +
> >>> +	/* store socket_id in TLS for quick access */
> >>> +	RTE_PER_LCORE(_socket_id) =
> >>> +		eal_cpuset_socket_id(&lcore_config[lcore_id].cpuset);
> >>> +
> >>> +	CPU_COPY(&lcore_config[lcore_id].cpuset, &RTE_PER_LCORE(_cpuset));
> >>> +
> >>> +	lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
> >>>
> >>> -	thread = pthread_self();
> >>> -	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
> >>> -	if (s != 0) {
> >>> -		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
> >>> -		return -1;
> >>> -	}
> >>> -#endif
> >>
> >> You are removing a lot of code that was using CPU_ALLOC().
> >> Are we sure that the cpuset_t type is large enough to store all the
> >> CPUs?
> >>
> >> It looks like the current value of CPU_SETSIZE is 1024 now, but I wonder
> >> if this code was written when this value was lower. Could you check if
> >> it can happen today (maybe with an old libc)? A problem can occur if
> >> the number of CPUs a cpuset_t can hold is lower than RTE_MAX_LCORE.
> > [LCM] I found that the macro is actually not just there to support CPU_ALLOC(), but to
> distinguish linux from freebsd.
> > In freebsdapp, CPU_ALLOC is not defined, so it uses the fixed-width *cpuset_t*.
> > In linuxapp, CPU_ALLOC is defined, so it uses cpu_set_t* and a dynamic
> CPU_ALLOC(RTE_MAX_LCORE).
> > But actually RTE_MAX_LCORE < 1024 (the number of CPUs a cpu_set_t holds, i.e. CPU_SETSIZE).
> > With rte_cpuset_t, there's no remaining reason to use CPU_ALLOC only
> for linuxapp and choose a smaller but dynamic width.
> 
> I did a quick search on google, and it seems CPU_SETSIZE has been 1024
> for a long time. So you are right, there is probably no reason to
> keep CPU_ALLOC(). As I said in a previous mail, it could be useful
> in the future when the number of CPUs reaches 1024, but we have
> some time to handle this.
[LCM]  Ok, thanks.
> 
> 
> 

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-09 17:49               ` Olivier MATZ
@ 2015-02-10  2:53                 ` Liang, Cunming
  2015-02-10 11:15                   ` Ananyev, Konstantin
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  2:53 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:49 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1)
> by default
> 
> Hi,
> 
> On 02/09/2015 03:24 PM, Liang, Cunming wrote:
> >>> --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
> >>> +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
> >>> @@ -57,8 +57,8 @@
> >>>  #include "eal_private.h"
> >>>  #include "eal_thread.h"
> >>>
> >>> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> >>> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> >>> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> >>> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) =
> (unsigned)SOCKET_ID_ANY;
> >>>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
> >>
> >> As far as I understand, now a rte_lcore_id() can return LCORE_ID_ANY.
> >> This should be modified in the rte_lcore_id() API comments.
> >>
> >> Same for rte_socket_id().
> > [LCM] accept.
> >>
> >> I also wonder if the API of these functions should be modified to
> >> return an int instead of an unsigned as LCORE_ID_ANY is -1.
> > [LCM] I prefer not to change the API definition. (unsigned)LCORE_ID_ANY is already
> used elsewhere.
> 
> OK
> 
> And what about directly defining the following?
> 
> #define LCORE_ID_ANY ((unsigned)-1)
> 
> 
> It would avoid the casts.
[LCM] Good point, will update it.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL thread
  2015-02-09 17:44               ` Olivier MATZ
@ 2015-02-10  2:56                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  2:56 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:45 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 11/17] log: fix the gap to support non-EAL
> thread
> 
> Hi,
> 
> On 02/09/2015 03:19 PM, Liang, Cunming wrote:
> >>> --- a/lib/librte_eal/common/include/rte_log.h
> >>> +++ b/lib/librte_eal/common/include/rte_log.h
> >>> @@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
> >>>  void rte_set_log_type(uint32_t type, int enable);
> >>>
> >>>  /**
> >>> + * Get the global log type.
> >>> + */
> >>> +uint32_t rte_get_log_type(void);
> >>> +
> >>> +/**
> >>>   * Get the current loglevel for the message being processed.
> >>>   *
> >>>   * Before calling the user-defined stream for logging, the log
> >>>
> >>
> >> Wouldn't it be better to change the variable:
> >> static struct log_cur_msg log_cur_msg[RTE_MAX_LCORE];
> >> into a pthread (tls) variable?
> >>
> >> With your patch, the log level and log type are not saved for
> >> non-EAL threads. If TLS were used, I think it would work in any case.
> > [LCM] Good point. But this patch set is still supposed to avoid any big impact on
> EAL threads.
> > To improve non-EAL thread support, we'll have a separate patch set.
> 
> OK, that's fine
> 
> Will it be for 2.0 or later?
[LCM] It will be in 2.1, I suppose, together with the patch adding mempool cache support for non-EAL threads.

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL thread
  2015-02-09 17:52               ` Olivier MATZ
@ 2015-02-10  2:57                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  2:57 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, February 10, 2015 1:52 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 14/17] mempool: add support to non-EAL
> thread
> 
> Hi,
> 
> On 02/09/2015 03:41 PM, Liang, Cunming wrote:
> >>>  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> >>> -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> >>> -		unsigned __lcore_id = rte_lcore_id();		\
> >>> -		mp->stats[__lcore_id].name##_objs += n;		\
> >>> -		mp->stats[__lcore_id].name##_bulk += 1;		\
> >>> +#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
> >>> +		unsigned __lcore_id = rte_lcore_id();           \
> >>> +		if (__lcore_id < RTE_MAX_LCORE) {               \
> >>> +			mp->stats[__lcore_id].name##_objs += n;	\
> >>> +			mp->stats[__lcore_id].name##_bulk += 1;	\
> >>> +		}                                               \
> >>
> >> Does it mean that we have no statistics for non-EAL threads?
> >> (same question for rings and timers in the next patches)
> > [LCM] Yes, that is the case in this patch set. It mainly focuses on EAL threads and makes sure
> nothing breaks when running on a non-EAL thread.
> > Full non-EAL functionality will come in a separate patch set that enhances non-EAL thread
> support as a second step.
> 
> OK
> 
> >>> @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp,
> void
> >> **obj_table,
> >>>  	uint32_t cache_size = mp->cache_size;
> >>>
> >>>  	/* cache is not enabled or single consumer */
> >>> -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> >>> +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> >>> +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> >>>  		goto ring_dequeue;
> >>>
> >>>  	cache = &mp->local_cache[lcore_id];
> >>>
> >>
> >> What is the performance impact of adding this test?
> > [LCM] In the unit-test perf run, it's almost the same. But I haven't measured the case where
> an EAL thread and a non-EAL thread share the same mempool.
> 
> 
> When you say "unit test", are you talking about mempool tests from
> "make test"? Do you have some numbers to share?
[LCM] I mean DPDK app/test, running mempool_perf_test. I will add numbers in v5.
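
For readers following the mempool change, a simplified sketch of the guard
under discussion: per-lcore statistics are only updated when the caller's
lcore id can index the table, so a non-EAL thread is silently skipped. The
my_* types and MY_MAX_LCORE are hypothetical reductions of the real
structures:

```c
#include <assert.h>
#include <stddef.h>

#define MY_MAX_LCORE 4  /* assumed small table size for the example */

struct my_pool_stats {
	unsigned long put_objs;  /* number of objects put */
	unsigned long put_bulk;  /* number of put operations */
};

struct my_pool {
	struct my_pool_stats stats[MY_MAX_LCORE];
};

/* Account a put of n objects, but only for ids that fit the per-lcore table. */
static void my_stat_add_put(struct my_pool *mp, unsigned lcore_id, unsigned n)
{
	assert(mp != NULL);
	if (lcore_id < MY_MAX_LCORE) {
		mp->stats[lcore_id].put_objs += n;
		mp->stats[lcore_id].put_bulk += 1;
	}
	/* else: non-EAL thread, nothing is recorded (the gap raised in review) */
}

/* Sum the per-lcore counters. */
static unsigned long my_total_put_objs(const struct my_pool *mp)
{
	unsigned long total = 0;
	unsigned i;

	for (i = 0; i < MY_MAX_LCORE; i++)
		total += mp->stats[i].put_objs;
	return total;
}
```

The extra comparison is the cost being asked about; the same bound also
keeps the cache lookup from indexing past local_cache[].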

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid
  2015-02-08 20:00           ` Olivier MATZ
@ 2015-02-10  6:57             ` Liang, Cunming
  2015-02-10 17:16               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-10  6:57 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, February 09, 2015 4:01 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique
> system tid
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > The rte_gettid() wraps the linux and freebsd syscall gettid().
> > It provides a persistent unique thread id for the calling thread.
> > It will save the unique id in TLS on the first time.
> >
> > [...]
> >
> > +/**
> > + * A wrap API for syscall gettid.
> > + *
> > + * @return
> > + *   On success, returns the thread ID of calling process.
> > + *   It always successful.
> > + */
> > +int rte_sys_gettid(void);
> > +
> > +/**
> > + * Get system unique thread id.
> > + *
> > + * @return
> > + *   On success, returns the thread ID of calling process.
> > + *   It always successful.
> > + */
> > +static inline int rte_gettid(void)
> > +{
> > +	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
> > +	if (RTE_PER_LCORE(_thread_id) == -1)
> > +		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
> > +	return RTE_PER_LCORE(_thread_id);
> > +}
> 
> Instead of doing the test each time rte_gettid() is called, why not
> having 2 functions:
>   rte_init_tid() -> assign the per_lcore variable
>   rte_gettid() -> return the per_lcore variable

[LCM] rte_gettid() is mainly used in the recursive spinlock.
For non-EAL threads, we don't expect a new user thread to have to call anything explicitly.
The purpose of calling it in EAL thread init is to avoid the overhead of the first call for EAL threads.
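
A small sketch of the trade-off being discussed, with a counting stub in
place of the real syscall. All my_* names are hypothetical; the lazy getter
works without any explicit setup, while an init call at thread start pays
the syscall cost up front:

```c
#include <assert.h>

static int my_sys_calls;  /* counts stub "syscalls" (illustration only, not thread-safe) */

/* Stub standing in for the gettid syscall wrapper. */
static int my_sys_gettid(void)
{
	my_sys_calls++;
	return 4242;
}

/* Per-thread cached id, -1 means "not fetched yet". */
static _Thread_local int my_tid = -1;

/* Explicit init: has the side effect, returns nothing. */
static void my_init_tid(void)
{
	my_tid = my_sys_gettid();
}

/* Lazy getter: usable from any thread, pays one branch per call. */
static int my_gettid(void)
{
	if (my_tid == -1)
		my_init_tid();
	assert(my_tid != -1);
	return my_tid;
}
```

An EAL thread could call my_init_tid() during startup so its first
my_gettid() is already a plain read, while a user-created thread still
works without calling anything first.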

> 
> 
> 
> Regards,
> Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-10  2:53                 ` Liang, Cunming
@ 2015-02-10 11:15                   ` Ananyev, Konstantin
  0 siblings, 0 replies; 253+ messages in thread
From: Ananyev, Konstantin @ 2015-02-10 11:15 UTC (permalink / raw)
  To: Liang, Cunming, Olivier MATZ, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Liang, Cunming
> Sent: Tuesday, February 10, 2015 2:54 AM
> To: Olivier MATZ; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1) by default
> 
> 
> 
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Tuesday, February 10, 2015 1:49 AM
> > To: Liang, Cunming; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v4 12/17] eal: set _lcore_id and _socket_id to (-1)
> > by default
> >
> > Hi,
> >
> > On 02/09/2015 03:24 PM, Liang, Cunming wrote:
> > >>> --- a/lib/librte_eal/linuxapp/eal/eal_thread.c
> > >>> +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
> > >>> @@ -57,8 +57,8 @@
> > >>>  #include "eal_private.h"
> > >>>  #include "eal_thread.h"
> > >>>
> > >>> -RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
> > >>> -RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
> > >>> +RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = (unsigned)LCORE_ID_ANY;
> > >>> +RTE_DEFINE_PER_LCORE(unsigned, _socket_id) =
> > (unsigned)SOCKET_ID_ANY;
> > >>>  RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
> > >>
> > >> As far as I understand, now a rte_lcore_id() can return LCORE_ID_ANY.
> > >> This should be modified in the rte_lcore_id() API comments.
> > >>
> > >> Same for rte_socket_id().
> > > [LCM] accept.
> > >>
> > >> I also wonder if the API of these functions should be modified to
> > >> return an int instead of an unsigned as LCORE_ID_ANY is -1.
> > > [LCM] I prefer not to change the API definition. (unsigned)LCORE_ID_ANY is already
> > used elsewhere.
> >
> > OK
> >
> > And what about directly defining the following?
> >
> > #define LCORE_ID_ANY ((unsigned)-1)
> >
> >
> > It would avoid the casts.
> [LCM] Good point, will update it.

UINT32_MAX ?

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 16/17] ring: add sched_yield to avoid spin forever
  2015-02-09 15:43             ` Ananyev, Konstantin
@ 2015-02-10 16:53               ` Olivier MATZ
  0 siblings, 0 replies; 253+ messages in thread
From: Olivier MATZ @ 2015-02-10 16:53 UTC (permalink / raw)
  To: Ananyev, Konstantin, Liang, Cunming, dev

Hi Konstantin,

On 02/09/2015 04:43 PM, Ananyev, Konstantin wrote:
>> The ring library was designed with the assumption that the code is not
>> preemptable. The code is lock-less but not wait-less. Actually, if the
>> code is preempted at a bad moment, it can spin forever until it's
>> unscheduled.
>>
>> I wonder if adding a sched_yield() may not penalize the current
>> implementations that only use one pthread per core? Even if there
>> is only one pthread in the scheduler queue for this CPU, calling
>> the scheduler code may cost thousands of cycles.
>>
>> Also, where does this value "RTE_RING_PAUSE_REP 0x100" comes from?
>> Why 0x100 is better than 42 or than 10000?
>
> The idea was to have something few times bigger than actual number
> active cores in the system, to minimise chance of  a sched_yield() being called
> for the case when we have one thread per physical core.
> My thought was that having that many repeats would make such chance neglectable.
> Though, right now, I don't have any data to back it up.
>
>> I think it could be good to check if there is a performance impact
>> with this change, especially where there is a lot of contention on
>> the ring. If it has an impact, what about adding a compile or runtime
>> option?
>
> Good idea, probably we should make RTE_RING_PAUSE_REP  configuration option
> and let say avoid emitting ' sched_yield();' at all, if  RTE_RING_PAUSE_REP == 0.

Yes, it looks like a good compromise.

Olivier
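
The compromise agreed above could look roughly like the sketch below. It is not the code from the patch: EX_PAUSE_REP is a hypothetical stand-in for RTE_RING_PAUSE_REP, and the helper names are invented for illustration. A value of 0 means the yield branch is never taken, so a compiler can drop the sched_yield() call entirely when the macro is a constant 0.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical knob modelled on RTE_RING_PAUSE_REP; 0 disables yielding. */
#ifndef EX_PAUSE_REP
#define EX_PAUSE_REP 0x100u
#endif

/* Count one spin; return 1 every pause_rep spins, never if pause_rep == 0. */
static inline int
ex_should_yield(unsigned *rep, unsigned pause_rep)
{
	if (pause_rep == 0)
		return 0;
	if (++(*rep) < pause_rep)
		return 0;
	*rep = 0;
	return 1;
}

/*
 * Spin until *tail == expected, yielding the CPU every EX_PAUSE_REP spins.
 * Returns the number of sched_yield() calls issued. A real ring would also
 * issue a CPU pause hint and use proper memory ordering; both are omitted.
 */
static inline unsigned
ex_wait_for_tail(const volatile unsigned *tail, unsigned expected)
{
	unsigned rep = 0, yields = 0;

	assert(tail != NULL);
	while (*tail != expected) {
		if (ex_should_yield(&rep, EX_PAUSE_REP)) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

With one pthread per core the loop normally exits long before the counter reaches the threshold, so the scheduler is not entered on the fast path.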

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 07/17] eal: add rte_gettid() to acquire unique system tid
  2015-02-10  6:57             ` Liang, Cunming
@ 2015-02-10 17:16               ` Olivier MATZ
  0 siblings, 0 replies; 253+ messages in thread
From: Olivier MATZ @ 2015-02-10 17:16 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/10/2015 07:57 AM, Liang, Cunming wrote:
>>> +/**
>>> + * Get system unique thread id.
>>> + *
>>> + * @return
>>> + *   On success, returns the thread ID of calling process.
>>> + *   It always successful.
>>> + */
>>> +static inline int rte_gettid(void)
>>> +{
>>> +	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
>>> +	if (RTE_PER_LCORE(_thread_id) == -1)
>>> +		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
>>> +	return RTE_PER_LCORE(_thread_id);
>>> +}
>>
>> Instead of doing the test each time rte_gettid() is called, why not
>> having 2 functions:
>>    rte_init_tid() -> assign the per_lcore variable
>>    rte_gettid() -> return the per_lcore variable
>
> [LCM] rte_gettid() is mainly used in the recursive spinlock.
> For a non-EAL thread, we don't expect a new user thread to have to call anything explicitly.
> The purpose of calling it in EAL thread init is to lower the overhead of the first call for an EAL thread.

Got it. So that's fine like you proposed.

Olivier
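
The lazy-initialisation pattern settled on above can be sketched as follows. This is an illustration only: it is Linux-specific, uses the GCC/Clang __thread extension in place of RTE_DEFINE_PER_LCORE, and the ex_* names are invented. The first call in each thread pays for the system call; later calls only read the cached value, so a user-created thread never has to call an init function.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Per-thread cache; -1 means "not fetched yet". */
static __thread int ex_thread_id = -1;

/* Ask the kernel for the calling thread's id (Linux only). */
static inline int
ex_sys_gettid(void)
{
	return (int)syscall(SYS_gettid);
}

/* Return the cached thread id, fetching it on first use in this thread. */
static inline int
ex_gettid(void)
{
	if (ex_thread_id == -1)
		ex_thread_id = ex_sys_gettid();
	assert(ex_thread_id > 0);
	return ex_thread_id;
}
```

Calling ex_gettid() once during EAL thread start-up would move the one-time cost out of the first spinlock acquisition, which matches the intent described in the reply.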

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
  2015-02-02  2:02         ` [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread Cunming Liang
@ 2015-02-10 17:45           ` Olivier MATZ
  2015-02-11  6:25             ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-10 17:45 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/02/2015 03:02 AM, Cunming Liang wrote:
> Allow to setup timers only for EAL (lcore) threads (__lcore_id < MAX_LCORE_ID).
> E.g. – dynamically created thread will be able to reset/stop timer for lcore thread,
> but it will be not allowed to setup timer for itself or another non-lcore thread.
> rte_timer_manage() for non-lcore thread would simply do nothing and return straightway.
>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>   lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++---------
>   lib/librte_timer/rte_timer.h |  2 +-
>   2 files changed, 32 insertions(+), 10 deletions(-)
>
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> index 269a992..601c159 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
>
>   /* when debug is enabled, store some statistics */
>   #ifdef RTE_LIBRTE_TIMER_DEBUG
> -#define __TIMER_STAT_ADD(name, n) do {				\
> -		unsigned __lcore_id = rte_lcore_id();		\
> -		priv_timer[__lcore_id].stats.name += (n);	\
> +#define __TIMER_STAT_ADD(name, n) do {					\
> +		unsigned __lcore_id = rte_lcore_id();			\
> +		if (__lcore_id < RTE_MAX_LCORE)				\
> +			priv_timer[__lcore_id].stats.name += (n);	\
>   	} while(0)
>   #else
>   #define __TIMER_STAT_ADD(name, n) do {} while(0)
> @@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
>   	unsigned lcore_id;
>
>   	lcore_id = rte_lcore_id();
> +	if (lcore_id >= RTE_MAX_LCORE)
> +		lcore_id = LCORE_ID_ANY;

Is this still valid?
In my understanding, rte_lcore_id() was returning the core id or
LCORE_ID_ANY if it's a non-EAL thread.

>
>   	/* wait that the timer is in correct status before update,
>   	 * and mark it as being configured */
>   	while (success == 0) {
>   		prev_status.u32 = tim->status.u32;
>
> +		/*
> +		 * prevent race condition of non-EAL threads
> +		 * to update the timer. When 'owner == LCORE_ID_ANY',
> +		 * it means updated by a non-EAL thread.
> +		 */
> +		if (lcore_id == (unsigned)LCORE_ID_ANY &&
> +		    (uint16_t)lcore_id == prev_status.owner)
> +			return -1;
> +

Are you sure this is required?

I think prev_status.owner can be LCORE_ID_ANY only in config state,
as a timer cannot be scheduled on a non-EAL thread. And there is
already a test that returns -1 if state is CONFIG.


>   		/* timer is running on another core, exit */
>   		if (prev_status.state == RTE_TIMER_RUNNING &&
> -		    (unsigned)prev_status.owner != lcore_id)
> +		    prev_status.owner != (uint16_t)lcore_id)
>   			return -1;
>
>   		/* timer is being configured on another core */
> @@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
>
>   	/* round robin for tim_lcore */
>   	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
> -		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
> -					       0, 1);
> -		priv_timer[lcore_id].prev_lcore = tim_lcore;
> +		if (lcore_id < RTE_MAX_LCORE) {

if (lcore_id != LCORE_ID_ANY) ?


> +			tim_lcore = rte_get_next_lcore(
> +				priv_timer[lcore_id].prev_lcore,
> +				0, 1);
> +			priv_timer[lcore_id].prev_lcore = tim_lcore;
> +		} else
> +			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);

I think the following line:
tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
Will return the first enabled core.

Maybe using rte_get_master_lcore() is clearer?



>   	}
>
>   	/* wait that the timer is in correct status before update,
> @@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
>   		return -1;
>
>   	__TIMER_STAT_ADD(reset, 1);
> -	if (prev_status.state == RTE_TIMER_RUNNING) {
> +	if (prev_status.state == RTE_TIMER_RUNNING &&
> +	    lcore_id < RTE_MAX_LCORE) {

if (lcore_id != LCORE_ID_ANY) ?


>   		priv_timer[lcore_id].updated = 1;
>   	}
>
> @@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
>   		return -1;
>
>   	__TIMER_STAT_ADD(stop, 1);
> -	if (prev_status.state == RTE_TIMER_RUNNING) {
> +	if (prev_status.state == RTE_TIMER_RUNNING &&
> +	    lcore_id < RTE_MAX_LCORE) {

if (lcore_id != LCORE_ID_ANY) ?


>   		priv_timer[lcore_id].updated = 1;
>   	}
>
> @@ -499,6 +517,10 @@ void rte_timer_manage(void)
>   	uint64_t cur_time;
>   	int i, ret;
>
> +	/* timer manager only runs on EAL thread */
> +	if (lcore_id >= RTE_MAX_LCORE)
> +		return;
> +

Maybe an assert is more visible here. Else, if someone calls
rte_timer_manage() from a non-EAL core, it will just exit
silently.

Maybe adding a comment in rte_timer.h saying that this function
must be called from an EAL core would also help.



Regards,
Olivier

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
  2015-02-10 17:45           ` Olivier MATZ
@ 2015-02-11  6:25             ` Liang, Cunming
  2015-02-11 17:21               ` Olivier MATZ
  0 siblings, 1 reply; 253+ messages in thread
From: Liang, Cunming @ 2015-02-11  6:25 UTC (permalink / raw)
  To: Olivier MATZ, dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Wednesday, February 11, 2015 1:45 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
> 
> Hi,
> 
> On 02/02/2015 03:02 AM, Cunming Liang wrote:
> > Allow to setup timers only for EAL (lcore) threads (__lcore_id <
> MAX_LCORE_ID).
> > E.g. – dynamically created thread will be able to reset/stop timer for lcore
> thread,
> > but it will be not allowed to setup timer for itself or another non-lcore thread.
> > rte_timer_manage() for non-lcore thread would simply do nothing and return
> straightway.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >   lib/librte_timer/rte_timer.c | 40 +++++++++++++++++++++++++++++++------
> ---
> >   lib/librte_timer/rte_timer.h |  2 +-
> >   2 files changed, 32 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> > index 269a992..601c159 100644
> > --- a/lib/librte_timer/rte_timer.c
> > +++ b/lib/librte_timer/rte_timer.c
> > @@ -79,9 +79,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
> >
> >   /* when debug is enabled, store some statistics */
> >   #ifdef RTE_LIBRTE_TIMER_DEBUG
> > -#define __TIMER_STAT_ADD(name, n) do {				\
> > -		unsigned __lcore_id = rte_lcore_id();		\
> > -		priv_timer[__lcore_id].stats.name += (n);	\
> > +#define __TIMER_STAT_ADD(name, n) do {					\
> > +		unsigned __lcore_id = rte_lcore_id();			\
> > +		if (__lcore_id < RTE_MAX_LCORE)				\
> > +			priv_timer[__lcore_id].stats.name += (n);	\
> >   	} while(0)
> >   #else
> >   #define __TIMER_STAT_ADD(name, n) do {} while(0)
> > @@ -127,15 +128,26 @@ timer_set_config_state(struct rte_timer *tim,
> >   	unsigned lcore_id;
> >
> >   	lcore_id = rte_lcore_id();
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		lcore_id = LCORE_ID_ANY;
> 
> Is this still valid?
> In my understanding, rte_lcore_id() was returning the core id or
> LCORE_ID_ANY if it's a non-EAL thread.
[LCM] It's a nice-to-have, in case lcore_id holds an invalid number.
We can add an assert to replace these two lines.
> 
> >
> >   	/* wait that the timer is in correct status before update,
> >   	 * and mark it as being configured */
> >   	while (success == 0) {
> >   		prev_status.u32 = tim->status.u32;
> >
> > +		/*
> > +		 * prevent race condition of non-EAL threads
> > +		 * to update the timer. When 'owner == LCORE_ID_ANY',
> > +		 * it means updated by a non-EAL thread.
> > +		 */
> > +		if (lcore_id == (unsigned)LCORE_ID_ANY &&
> > +		    (uint16_t)lcore_id == prev_status.owner)
> > +			return -1;
> > +
> 
> Are you sure this is required?
> 
> I think prev_status.owner can be LCORE_ID_ANY only in config state,
> as a timer cannot be scheduled on a non-EAL thread. And there is
> already a test that returns -1 if state is CONFIG.
[LCM] Good point, whenever prev_status.owner == LCORE_ID_ANY, the prev_status.state must be RTE_TIMER_CONFIG.
It makes sense to me to remove the condition check.
> 
> 
> >   		/* timer is running on another core, exit */
> >   		if (prev_status.state == RTE_TIMER_RUNNING &&
> > -		    (unsigned)prev_status.owner != lcore_id)
> > +		    prev_status.owner != (uint16_t)lcore_id)
> >   			return -1;
> >
> >   		/* timer is being configured on another core */
> > @@ -366,9 +378,13 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t
> expire,
> >
> >   	/* round robin for tim_lcore */
> >   	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
> > -		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
> > -					       0, 1);
> > -		priv_timer[lcore_id].prev_lcore = tim_lcore;
> > +		if (lcore_id < RTE_MAX_LCORE) {
> 
> if (lcore_id != LCORE_ID_ANY) ?
[LCM] Accept.
> 
> 
> > +			tim_lcore = rte_get_next_lcore(
> > +				priv_timer[lcore_id].prev_lcore,
> > +				0, 1);
> > +			priv_timer[lcore_id].prev_lcore = tim_lcore;
> > +		} else
> > +			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
> 
> I think the following line:
> tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
> Will return the first enabled core.
> 
> Maybe using rte_get_master_lcore() is clearer?
[LCM] It doesn't have to be the master lcore.
Any available lcore is fine, so I think it makes sense to just use the first enabled core.
> 
> 
> 
> >   	}
> >
> >   	/* wait that the timer is in correct status before update,
> > @@ -378,7 +394,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t
> expire,
> >   		return -1;
> >
> >   	__TIMER_STAT_ADD(reset, 1);
> > -	if (prev_status.state == RTE_TIMER_RUNNING) {
> > +	if (prev_status.state == RTE_TIMER_RUNNING &&
> > +	    lcore_id < RTE_MAX_LCORE) {
> 
> if (lcore_id != LCORE_ID_ANY) ?
> 
> 
> >   		priv_timer[lcore_id].updated = 1;
> >   	}
> >
> > @@ -455,7 +472,8 @@ rte_timer_stop(struct rte_timer *tim)
> >   		return -1;
> >
> >   	__TIMER_STAT_ADD(stop, 1);
> > -	if (prev_status.state == RTE_TIMER_RUNNING) {
> > +	if (prev_status.state == RTE_TIMER_RUNNING &&
> > +	    lcore_id < RTE_MAX_LCORE) {
> 
> if (lcore_id != LCORE_ID_ANY) ?
> 
> 
> >   		priv_timer[lcore_id].updated = 1;
> >   	}
> >
> > @@ -499,6 +517,10 @@ void rte_timer_manage(void)
> >   	uint64_t cur_time;
> >   	int i, ret;
> >
> > +	/* timer manager only runs on EAL thread */
> > +	if (lcore_id >= RTE_MAX_LCORE)
> > +		return;
> > +
> 
> Maybe an assert is more visible here. Else, if someone calls
> rte_timer_manage() from a non-EAL core, it will just exit
> silently.
> 
> Maybe adding a comment in rte_timer.h saying that this function
> must be called from an EAL core would also help.
[LCM] accept. 
> 
> 
> 
> Regards,
> Olivier


^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
  2015-02-11  6:25             ` Liang, Cunming
@ 2015-02-11 17:21               ` Olivier MATZ
  2015-02-12  0:29                 ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-11 17:21 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

On 02/11/2015 07:25 AM, Liang, Cunming wrote:
>>> +			tim_lcore = rte_get_next_lcore(
>>> +				priv_timer[lcore_id].prev_lcore,
>>> +				0, 1);
>>> +			priv_timer[lcore_id].prev_lcore = tim_lcore;
>>> +		} else
>>> +			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
>>
>> I think the following line:
>> tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
>> Will return the first enabled core.
>>
>> Maybe using rte_get_master_lcore() is clearer?
> [LCM] It doesn't expect must to be a master lcore.
> Any available lcore is fine, so I think make sense to just use the first enabled core.

Yes I agree it does not need to be the master lcore, but until recently
the definition of the master lcore was "the first enabled core".

I was thinking rte_get_master_lcore() is easier to understand
than rte_get_next_lcore(LCORE_ID_ANY, 0, 1). If you still prefer
to keep the second one, can you add a comment saying something like
"non-EAL thread do not run rte_timer_manage(), so schedule the timer
on the first enabled lcore"?

Thanks,
Olivier
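
To make the behaviour under discussion concrete, here is a toy model of the target-lcore choice. Everything in it is an assumption for illustration: the 8-entry enabled table, the ex_* names and the zero-initialised round-robin state are invented, and ex_next_lcore() only imitates what rte_get_next_lcore(i, 0, 1) is described as doing in this thread.

```c
#include <assert.h>

#define EX_MAX_LCORE    8u
#define EX_LCORE_ID_ANY ((unsigned)-1)

/* Toy enabled-lcore table standing in for the EAL configuration. */
static const int ex_enabled[EX_MAX_LCORE] = {0, 1, 0, 1, 1, 0, 0, 0};
/* Per-lcore round-robin state, zero-initialised in this sketch. */
static unsigned ex_prev_lcore[EX_MAX_LCORE];

/* Next enabled lcore after i, wrapping around; (unsigned)-1 + 1 wraps to 0. */
static inline unsigned
ex_next_lcore(unsigned i)
{
	unsigned n;

	for (n = 0; n < EX_MAX_LCORE; n++) {
		i = (i + 1) % EX_MAX_LCORE;
		if (ex_enabled[i])
			return i;
	}
	return EX_MAX_LCORE; /* nothing enabled */
}

/* Pick the lcore that will own a timer armed by 'caller'. */
static inline unsigned
ex_pick_timer_lcore(unsigned caller)
{
	unsigned target;

	if (caller != EX_LCORE_ID_ANY) {
		assert(caller < EX_MAX_LCORE);
		target = ex_next_lcore(ex_prev_lcore[caller]);
		ex_prev_lcore[caller] = target;
	} else {
		/* non-EAL threads do not run the timer manager, so
		 * schedule the timer on the first enabled lcore */
		target = ex_next_lcore(EX_LCORE_ID_ANY);
	}
	return target;
}
```

In this model a non-EAL caller always gets the same target (lcore 1 here), while an EAL caller rotates through the enabled lcores.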

^ permalink raw reply	[flat|nested] 253+ messages in thread

* Re: [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
  2015-02-11 17:21               ` Olivier MATZ
@ 2015-02-12  0:29                 ` Liang, Cunming
  0 siblings, 0 replies; 253+ messages in thread
From: Liang, Cunming @ 2015-02-12  0:29 UTC (permalink / raw)
  To: Olivier MATZ, dev

Hi,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Thursday, February 12, 2015 1:22 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 17/17] timer: add support to non-EAL thread
> 
> Hi,
> 
> On 02/11/2015 07:25 AM, Liang, Cunming wrote:
> >>> +			tim_lcore = rte_get_next_lcore(
> >>> +				priv_timer[lcore_id].prev_lcore,
> >>> +				0, 1);
> >>> +			priv_timer[lcore_id].prev_lcore = tim_lcore;
> >>> +		} else
> >>> +			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
> >>
> >> I think the following line:
> >> tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
> >> Will return the first enabled core.
> >>
> >> Maybe using rte_get_master_lcore() is clearer?
> > [LCM] It doesn't expect must to be a master lcore.
> > Any available lcore is fine, so I think make sense to just use the first enabled
> core.
> 
> Yes I agree it does not need to be the master lcore, but until recently
> the definition of the master lcore was "the first enabled core".
> 
> I was thinking rte_get_master_lcore() is easier to understand
> than rte_get_next_lcore(LCORE_ID_ANY, 0, 1). If you still prefer
> to keep the second one, can you add a comment saying something like
> "non-EAL thread do not run rte_timer_manage(), so schedule the timer
> on the first enabled lcore"?
[LCM] That makes sense, will add it. Thanks.
> 
> Thanks,
> Olivier


^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 00/19] support multi-pthread per core
  2015-02-02  2:02       ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Cunming Liang
                           ` (17 preceding siblings ...)
  2015-02-06 15:47         ` [dpdk-dev] [PATCH v4 00/17] support multi-pthread per core Olivier MATZ
@ 2015-02-12  8:16         ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 01/19] eal: add cpuset into per EAL thread lcore_config Cunming Liang
                             ` (19 more replies)
  18 siblings, 20 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

v5 changes:
  reorder some patches and split out two additional patches
  rte_thread_get_affinity() return type change to avoid
  add RTE_RING_PAUSE_REP to the config, turned off by default

v4 changes:
  new patch fixing strnlen() invalid return in 32bit icc [03/17]
  update and add more comments on sched_yield() [16/17]

v3 changes:
  new patch adding sched_yield() in rte_ring to avoid long spin [16/17]

v2 changes:
  add '<number>-<number>' support for EAL option '--lcores' [02/17]

This patch series contains EAL enhancements and library fixes that allow
running multiple pthreads (EAL or non-EAL threads) per physical core.
The two major changes are:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore now stands for an EAL thread rather than a logical core.
  A new EAL option allows static assignment of lcores to cpusets.
  An lcore (EAL thread) is then affinitized to a cpuset; the original 1:1 mapping is the special case.
- Fix the libraries so they can run on any non-EAL thread.
  This closes the gaps when running libraries in non-EAL threads (created dynamically by the user).
  Each fixed library handles the case of rte_lcore_id() >= RTE_MAX_LCORE.

Many thanks to Konstantin, Bruce, Mirek and Stephen for their comments during the RFC review.

Cunming Liang (19):
  eal: add cpuset into per EAL thread lcore_config
  eal: fix PAGE_SIZE redefine complaint on freebsd
  eal: new eal option '--lcores' for cpu assignment
  eal: fix wrong strnlen() return value in 32bit icc
  eal: add support parsing socket_id from cpuset
  eal: new TLS definition and API declaration
  eal: add eal_common_thread.c for common thread API
  eal: standardize init sequence between linux and bsd
  eal: add rte_gettid() to acquire unique system tid
  eal: apply affinity of EAL thread by assigned cpuset
  enic: fix re-define freebsd compile complain
  malloc: fix the issue of SOCKET_ID_ANY
  log: fix the gap to support non-EAL thread
  eal: set _lcore_id and _socket_id to (-1) by default
  eal: fix recursive spinlock in non-EAL thraed
  mempool: add support to non-EAL thread
  ring: add support to non-EAL thread
  ring: add sched_yield to avoid spin forever
  timer: add support to non-EAL thread

 config/common_bsdapp                               |   1 +
 config/common_linuxapp                             |   1 +
 lib/librte_eal/bsdapp/eal/Makefile                 |   1 +
 lib/librte_eal/bsdapp/eal/eal.c                    |  14 +-
 lib/librte_eal/bsdapp/eal/eal_lcore.c              |  14 +
 lib/librte_eal/bsdapp/eal/eal_memory.c             |   8 +-
 lib/librte_eal/bsdapp/eal/eal_thread.c             |  77 ++----
 lib/librte_eal/common/eal_common_log.c             |  17 +-
 lib/librte_eal/common/eal_common_options.c         | 308 ++++++++++++++++++++-
 lib/librte_eal/common/eal_common_thread.c          | 150 ++++++++++
 lib/librte_eal/common/eal_options.h                |   2 +
 lib/librte_eal/common/eal_thread.h                 |  47 ++++
 .../common/include/generic/rte_spinlock.h          |   4 +-
 lib/librte_eal/common/include/rte_eal.h            |  27 ++
 lib/librte_eal/common/include/rte_lcore.h          |  40 ++-
 lib/librte_eal/common/include/rte_log.h            |   5 +
 lib/librte_eal/linuxapp/eal/Makefile               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   8 +-
 lib/librte_eal/linuxapp/eal/eal_lcore.c            |  15 +-
 lib/librte_eal/linuxapp/eal/eal_thread.c           |  77 ++----
 lib/librte_malloc/malloc_heap.h                    |   7 +-
 lib/librte_mempool/rte_mempool.h                   |  18 +-
 lib/librte_pmd_enic/enic.h                         |   4 +-
 lib/librte_pmd_enic/enic_compat.h                  |   2 +-
 lib/librte_pmd_enic/vnic/vnic_dev.c                |   6 +-
 lib/librte_ring/rte_ring.h                         |  41 ++-
 lib/librte_timer/rte_timer.c                       |  31 ++-
 lib/librte_timer/rte_timer.h                       |   4 +-
 28 files changed, 764 insertions(+), 169 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 01/19] eal: add cpuset into per EAL thread lcore_config
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 02/19] eal: fix PAGE_SIZE redefine complaint on freebsd Cunming Liang
                             ` (18 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

The patch adds 'cpuset' to the per-lcore configuration 'lcore_config[]',
as an lcore is no longer always pinned 1:1 to a physical cpu.
An lcore now stands for an EAL thread rather than a logical cpu.

It doesn't change the default behavior of 1:1 mapping, but allows
an EAL thread to be affinitized to multiple cpus.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   separate eal_memory.c to the new patch

 lib/librte_eal/bsdapp/eal/eal_lcore.c     | 7 +++++++
 lib/librte_eal/common/include/rte_lcore.h | 8 ++++++++
 lib/librte_eal/linuxapp/eal/Makefile      | 1 +
 lib/librte_eal/linuxapp/eal/eal_lcore.c   | 8 ++++++++
 4 files changed, 24 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 662f024..72f8ac2 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -76,11 +76,18 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
 		lcore_config[lcore_id].detected = (lcore_id < ncpus);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 49b2c03..4c7d6bb 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -50,6 +50,13 @@ extern "C" {
 
 #define LCORE_ID_ANY -1    /**< Any lcore. */
 
+#if defined(__linux__)
+	typedef	cpu_set_t rte_cpuset_t;
+#elif defined(__FreeBSD__)
+#include <pthread_np.h>
+	typedef cpuset_t rte_cpuset_t;
+#endif
+
 /**
  * Structure storing internal configuration (per-lcore)
  */
@@ -65,6 +72,7 @@ struct lcore_config {
 	unsigned socket_id;        /**< physical socket id for this lcore */
 	unsigned core_id;          /**< core number on socket for this lcore */
 	int core_index;            /**< relative index, starting from 0 */
+	rte_cpuset_t cpuset;       /**< cpu set which the lcore affinity to */
 };
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index e117cec..1b6c484 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -91,6 +91,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
+CFLAGS_eal_lcore.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index c67e0e6..29615f8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -158,11 +158,19 @@ rte_eal_cpu_init(void)
 	 * ones and enable them by default.
 	 */
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+		/* init cpuset for per lcore config */
+		CPU_ZERO(&lcore_config[lcore_id].cpuset);
+
+		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
 			config->lcore_role[lcore_id] = ROLE_OFF;
 			continue;
 		}
+
+		/* By default, lcore 1:1 map to cpu id */
+		CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset);
+
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
-- 
1.8.1.4
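
As a companion to the patch above, the sketch below shows how a per-lcore cpuset like the one it introduces might be built and applied. It is Linux-only and purely illustrative: the ex_* helpers are invented, and sched_setaffinity() is used here as one way to pin the calling thread, which is an assumption and not necessarily what later patches in the series do.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Build a cpuset from an array of n CPU ids. */
static inline void
ex_make_cpuset(cpu_set_t *set, const int *cpus, int n)
{
	int i;

	CPU_ZERO(set);
	for (i = 0; i < n; i++) {
		assert(cpus[i] >= 0 && cpus[i] < CPU_SETSIZE);
		CPU_SET(cpus[i], set);
	}
}

/* Pin the calling thread to 'set'; returns 0 on success, -1 on error. */
static inline int
ex_pin_self(const cpu_set_t *set)
{
	/* on Linux, pid 0 means the calling thread */
	return sched_setaffinity(0, sizeof(*set), set);
}
```

The default 1:1 mapping corresponds to calling ex_make_cpuset() with a single CPU id equal to the lcore id; a 1:n mapping passes several ids.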

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 02/19] eal: fix PAGE_SIZE redefine complaint on freebsd
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 01/19] eal: add cpuset into per EAL thread lcore_config Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 03/19] eal: new eal option '--lcores' for cpu assignment Cunming Liang
                             ` (17 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev


Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_memory.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 65ee87d..33ebd0f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -45,7 +45,7 @@
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 
-#define PAGE_SIZE (sysconf(_SC_PAGESIZE))
+#define EAL_PAGE_SIZE (sysconf(_SC_PAGESIZE))
 
 /*
  * Get physical address of any mapped virtual address in the current process.
@@ -93,7 +93,8 @@ rte_eal_contigmem_init(void)
 			char physaddr_str[64];
 
 			addr = mmap(NULL, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-					MAP_SHARED, hpi->lock_descriptor, j * PAGE_SIZE);
+				    MAP_SHARED, hpi->lock_descriptor,
+				    j * EAL_PAGE_SIZE);
 			if (addr == MAP_FAILED) {
 				RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 						j, hpi->hugedir);
@@ -167,7 +168,8 @@ rte_eal_contigmem_attach(void)
 		struct rte_memseg *seg = &mcfg->memseg[i];
 
 		addr = mmap(seg->addr, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-				MAP_SHARED|MAP_FIXED, fd_hugepage, i * PAGE_SIZE);
+			    MAP_SHARED|MAP_FIXED, fd_hugepage,
+			    i * EAL_PAGE_SIZE);
 		if (addr == MAP_FAILED || addr != seg->addr) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 				i, hpi->hugedir);
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 03/19] eal: new eal option '--lcores' for cpu assignment
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 01/19] eal: add cpuset into per EAL thread lcore_config Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 02/19] eal: fix PAGE_SIZE redefine complaint on freebsd Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 04/19] eal: fix wrong strnlen() return value in 32bit icc Cunming Liang
                             ` (16 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

Add a new EAL long option '--lcores' for assigning cpusets to EAL threads.

The format pattern:
	--lcores='lcores[@cpus]<,lcores[@cpus]>'
Both lcores and cpus can be a single number, a range or a group.
'(' and ')' are required around a group.
If '@cpus' is not supplied, cpus takes the same value as lcores.

e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' starts 9 EAL threads as below
  lcore 0 runs on cpuset 0x41 (cpu 0,6)
  lcore 1 runs on cpuset 0x2 (cpu 1)
  lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
  lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
  lcore 6 runs on cpuset 0x41 (cpu 0,6)
  lcore 7 runs on cpuset 0x80 (cpu 7)
  lcore 8 runs on cpuset 0x100 (cpu 8)
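
The hexadecimal cpuset values in the list above are plain bit masks with bit n set for cpu n. A small sketch for checking them, with an invented helper name and an assumed limit of 64 cpus:

```c
#include <assert.h>
#include <stdint.h>

/* Fold a list of n CPU ids (each < 64) into a mask with bit c set per cpu c. */
static inline uint64_t
ex_cpus_to_mask(const unsigned *cpus, unsigned n)
{
	uint64_t mask = 0;
	unsigned i;

	for (i = 0; i < n; i++) {
		assert(cpus[i] < 64);
		mask |= UINT64_C(1) << cpus[i];
	}
	return mask;
}
```

For instance, cpus {0, 6} give 0x41 and cpus {5, 6, 7} give 0xe0, matching the lines for lcore 0 and lcore 2.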

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   add more comments for eal_parse_set
   fix some typo
   remove inline prefix from convert_to_cpuset()
   fix a bug introduced on v2 which broke case '(0,6)'

 v2 changes:
   add '<number>-<number>' support for EAL option '--lcores'

 lib/librte_eal/common/eal_common_options.c | 304 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/Makefile       |   1 +
 3 files changed, 303 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 67e02dc..178e303 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -45,6 +45,7 @@
 #include <rte_lcore.h>
 #include <rte_version.h>
 #include <rte_devargs.h>
+#include <rte_memcpy.h>
 
 #include "eal_internal_cfg.h"
 #include "eal_options.h"
@@ -85,6 +86,7 @@ eal_long_options[] = {
 	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
 	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
 	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
+	{OPT_LCORES, 1, 0, OPT_LCORES_NUM},
 	{0, 0, 0, 0}
 };
 
@@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist)
 			if (min == RTE_MAX_LCORE)
 				min = idx;
 			for (idx = min; idx <= max; idx++) {
-				cfg->lcore_role[idx] = ROLE_RTE;
-				lcore_config[idx].core_index = count;
-				count++;
+				if (cfg->lcore_role[idx] != ROLE_RTE) {
+					cfg->lcore_role[idx] = ROLE_RTE;
+					lcore_config[idx].core_index = count;
+					count++;
+				}
 			}
 			min = RTE_MAX_LCORE;
 		} else
@@ -292,6 +296,283 @@ eal_parse_master_lcore(const char *arg)
 	return 0;
 }
 
+/*
+ * Parse elem, the elem could be single number/range or '(' ')' group
+ * 1) A single number elem, it's just a simple digit. e.g. 9
+ * 2) A single range elem, two digits with a '-' between. e.g. 2-6
+ * 3) A group elem, combines multiple 1) or 2) with '( )'. e.g (0,2-4,6)
+ *    Within group elem, '-' used for a range separator;
+ *                       ',' used for a single number.
+ */
+static int
+eal_parse_set(const char *input, uint16_t set[], unsigned num)
+{
+	unsigned idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned min, max;
+
+	memset(set, 0, num * sizeof(uint16_t));
+
+	while (isblank(*str))
+		str++;
+
+	/* only digit or left bracket is qualify for start point */
+	if ((!isdigit(*str) && *str != '(') || *str == '\0')
+		return -1;
+
+	/* process single number or single range of number */
+	if (*str != '(') {
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+		else {
+			while (isblank(*end))
+				end++;
+
+			min = idx;
+			max = idx;
+			if (*end == '-') {
+				/* process single <number>-<number> */
+				end++;
+				while (isblank(*end))
+					end++;
+				if (!isdigit(*end))
+					return -1;
+
+				errno = 0;
+				idx = strtoul(end, &end, 10);
+				if (errno || end == NULL || idx >= num)
+					return -1;
+				max = idx;
+				while (isblank(*end))
+					end++;
+				if (*end != ',' && *end != '\0')
+					return -1;
+			}
+
+			if (*end != ',' && *end != '\0' &&
+			    *end != '@')
+				return -1;
+
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			return end - input;
+		}
+	}
+
+	/* process set within bracket */
+	str++;
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = RTE_MAX_LCORE;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-',',' and ')' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == ')')) {
+			max = idx;
+			if (min == RTE_MAX_LCORE)
+				min = idx;
+			for (idx = RTE_MIN(min, max);
+			     idx <= RTE_MAX(min, max); idx++)
+				set[idx] = 1;
+
+			min = RTE_MAX_LCORE;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0' && *end != ')');
+
+	return str - input;
+}
+
+/* convert from set array to cpuset bitmap */
+static int
+convert_to_cpuset(rte_cpuset_t *cpusetp,
+	      uint16_t *set, unsigned num)
+{
+	unsigned idx;
+
+	CPU_ZERO(cpusetp);
+
+	for (idx = 0; idx < num; idx++) {
+		if (!set[idx])
+			continue;
+
+		if (!lcore_config[idx].detected) {
+			RTE_LOG(ERR, EAL, "core %u "
+				"unavailable\n", idx);
+			return -1;
+		}
+
+		CPU_SET(idx, cpusetp);
+	}
+
+	return 0;
+}
+
+/*
+ * The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>'
+ * lcores, cpus could be a single digit/range or a group.
+ * '(' and ')' are necessary if it's a group.
+ * If not supply '@cpus', the value of cpus uses the same as lcores.
+ * e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means start 9 EAL thread as below
+ *   lcore 0 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 1 runs on cpuset 0x2 (cpu 1)
+ *   lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
+ *   lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
+ *   lcore 6 runs on cpuset 0x41 (cpu 0,6)
+ *   lcore 7 runs on cpuset 0x80 (cpu 7)
+ *   lcore 8 runs on cpuset 0x100 (cpu 8)
+ */
+static int
+eal_parse_lcores(const char *lcores)
+{
+	struct rte_config *cfg = rte_eal_get_configuration();
+	static uint16_t set[RTE_MAX_LCORE];
+	unsigned idx = 0;
+	int i;
+	unsigned count = 0;
+	const char *lcore_start = NULL;
+	const char *end = NULL;
+	int offset;
+	rte_cpuset_t cpuset;
+	int lflags = 0;
+	int ret = -1;
+
+	if (lcores == NULL)
+		return -1;
+
+	/* Remove all blank characters ahead and after */
+	while (isblank(*lcores))
+		lcores++;
+	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	while ((i > 0) && isblank(lcores[i - 1]))
+		i--;
+
+	CPU_ZERO(&cpuset);
+
+	/* Reset lcore config */
+	for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+		cfg->lcore_role[idx] = ROLE_OFF;
+		lcore_config[idx].core_index = -1;
+		CPU_ZERO(&lcore_config[idx].cpuset);
+	}
+
+	/* Get list of cores */
+	do {
+		while (isblank(*lcores))
+			lcores++;
+		if (*lcores == '\0')
+			goto err;
+
+		/* record lcore_set start point */
+		lcore_start = lcores;
+
+		/* go across a complete bracket */
+		if (*lcore_start == '(') {
+			lcores += strcspn(lcores, ")");
+			if (*lcores++ == '\0')
+				goto err;
+		}
+
+		/* scan the separator '@', ','(next) or '\0'(finish) */
+		lcores += strcspn(lcores, "@,");
+
+		if (*lcores == '@') {
+			/* explicitly assigned cpu_set */
+			offset = eal_parse_set(lcores + 1, set, RTE_DIM(set));
+			if (offset < 0)
+				goto err;
+
+			/* prepare cpu_set and update the end cursor */
+			if (0 > convert_to_cpuset(&cpuset,
+						  set, RTE_DIM(set)))
+				goto err;
+			end = lcores + 1 + offset;
+		} else { /* ',' or '\0' */
+			/* haven't given cpu_set, current loop done */
+			end = lcores;
+
+			/* go back to check <number>-<number> */
+			offset = strcspn(lcore_start, "(-");
+			if (offset < (end - lcore_start) &&
+			    *(lcore_start + offset) != '(')
+				lflags = 1;
+		}
+
+		if (*end != ',' && *end != '\0')
+			goto err;
+
+		/* parse lcore_set from start point */
+		if (0 > eal_parse_set(lcore_start, set, RTE_DIM(set)))
+			goto err;
+
+		/* without '@', by default using lcore_set as cpu_set */
+		if (*lcores != '@' &&
+		    0 > convert_to_cpuset(&cpuset, set, RTE_DIM(set)))
+			goto err;
+
+		/* start to update lcore_set */
+		for (idx = 0; idx < RTE_MAX_LCORE; idx++) {
+			if (!set[idx])
+				continue;
+
+			if (cfg->lcore_role[idx] != ROLE_RTE) {
+				lcore_config[idx].core_index = count;
+				cfg->lcore_role[idx] = ROLE_RTE;
+				count++;
+			}
+
+			if (lflags) {
+				CPU_ZERO(&cpuset);
+				CPU_SET(idx, &cpuset);
+			}
+			rte_memcpy(&lcore_config[idx].cpuset, &cpuset,
+				   sizeof(rte_cpuset_t));
+		}
+
+		lcores = end + 1;
+	} while (*end != '\0');
+
+	if (count == 0)
+		goto err;
+
+	cfg->lcore_count = count;
+	lcores_parsed = 1;
+	ret = 0;
+
+err:
+
+	return ret;
+}
+
 static int
 eal_parse_syslog(const char *facility, struct internal_config *conf)
 {
@@ -492,6 +773,13 @@ eal_parse_common_option(int opt, const char *optarg,
 		conf->log_level = log;
 		break;
 	}
+	case OPT_LCORES_NUM:
+		if (eal_parse_lcores(optarg) < 0) {
+			RTE_LOG(ERR, EAL, "invalid parameter for --"
+				OPT_LCORES "\n");
+			return -1;
+		}
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -530,7 +818,7 @@ eal_check_common_options(struct internal_config *internal_cfg)
 
 	if (!lcores_parsed) {
 		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
-			"-c or -l\n");
+			"-c, -l or --lcores\n");
 		return -1;
 	}
 	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
@@ -586,6 +874,14 @@ eal_common_usage(void)
 	       "                 The argument format is <c1>[-c2][,c3[-c4],...]\n"
 	       "                 where c1, c2, etc are core indexes between 0 and %d\n"
 	       "  --"OPT_MASTER_LCORE" ID: Core ID that is used as master\n"
+	       "  --"OPT_LCORES" MAP: maps between lcore_set to phys_cpu_set\n"
+	       "                 The argument format is\n"
+	       "                       'lcores[@cpus]<,lcores[@cpus],...>'\n"
+	       "                 lcores and cpus list are grouped by '(' and ')'\n"
+	       "                 Within the group, '-' is used for range separator,\n"
+	       "                 ',' is used for single number separator.\n"
+	       "                 '( )' can be omitted for single element group, '@' \n"
+	       "                 can be omitted if cpus and lcores has the same value\n"
 	       "  -n NUM       : Number of memory channels\n"
 	       "  -v           : Display version information on startup\n"
 	       "  -m MB        : memory to allocate (see also --"OPT_SOCKET_MEM")\n"
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e476f8d..a1cc59f 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -77,6 +77,8 @@ enum {
 	OPT_CREATE_UIO_DEV_NUM,
 #define OPT_VFIO_INTR    "vfio-intr"
 	OPT_VFIO_INTR_NUM,
+#define OPT_LCORES "lcores"
+	OPT_LCORES_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 1b6c484..4da9d27 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -99,6 +99,7 @@ CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
 CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
+CFLAGS_eal_common_options.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 04/19] eal: fix wrong strnlen() return value in 32bit icc
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (2 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 03/19] eal: new eal option '--lcores' for cpu assignment Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 05/19] eal: add support parsing socket_id from cpuset Cunming Liang
                             ` (15 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

The problem is that strnlen() here may return an invalid value with 32-bit icc
(it actually returns its second parameter, e.g. sysconf(_SC_ARG_MAX)).
It starts to manifest when the max_len parameter is > 2M and icc -m32 -O2 (or above) is used.
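
A minimal sketch of the trailing-blank trimming idiom the patch keeps,
written with strlen() as in the fix. The helper name
sketch_trimmed_len() is hypothetical and does not appear in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of s once trailing blanks (space, tab)
 * are dropped. strlen() is enough because the option string is always
 * NUL-terminated, so no upper bound is needed.
 */
static size_t
sketch_trimmed_len(const char *s)
{
	size_t i;

	assert(s != NULL);
	i = strlen(s);
	while (i > 0 && isblank((unsigned char)s[i - 1]))
		i--;
	return i;
}
```

For the string "0-3" followed by a space and a tab, the helper returns 3;
for a string of blanks only, it returns 0.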

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   using strlen instead of strnlen.

 lib/librte_eal/common/eal_common_options.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 178e303..9cf2faa 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -167,7 +167,7 @@ eal_parse_coremask(const char *coremask)
 	if (coremask[0] == '0' && ((coremask[1] == 'x')
 		|| (coremask[1] == 'X')))
 		coremask += 2;
-	i = strnlen(coremask, PATH_MAX);
+	i = strlen(coremask);
 	while ((i > 0) && isblank(coremask[i - 1]))
 		i--;
 	if (i == 0)
@@ -227,7 +227,7 @@ eal_parse_corelist(const char *corelist)
 	/* Remove all blank characters ahead and after */
 	while (isblank(*corelist))
 		corelist++;
-	i = strnlen(corelist, sysconf(_SC_ARG_MAX));
+	i = strlen(corelist);
 	while ((i > 0) && isblank(corelist[i - 1]))
 		i--;
 
@@ -472,7 +472,7 @@ eal_parse_lcores(const char *lcores)
 	/* Remove all blank characters ahead and after */
 	while (isblank(*lcores))
 		lcores++;
-	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	i = strlen(lcores);
 	while ((i > 0) && isblank(lcores[i - 1]))
 		i--;
 
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 05/19] eal: add support parsing socket_id from cpuset
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (3 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 04/19] eal: fix wrong strnlen() return value in 32bit icc Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 06/19] eal: new TLS definition and API declaration Cunming Liang
                             ` (14 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

It returns the socket_id if all cpus in the cpuset belong
to the same NUMA node; otherwise it returns SOCKET_ID_ANY.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   expose cpu_socket_id as eal_cpu_socket_id for linuxapp
   eal_cpuset_socket_id() remove static inline and move to c file

 lib/librte_eal/bsdapp/eal/eal_lcore.c   |  7 +++++++
 lib/librte_eal/common/eal_thread.h      | 11 +++++++++++
 lib/librte_eal/linuxapp/eal/eal_lcore.c |  7 ++++---
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c
index 72f8ac2..162fb4f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_lcore.c
+++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c
@@ -41,6 +41,7 @@
 #include <rte_debug.h>
 
 #include "eal_private.h"
+#include "eal_thread.h"
 
 /* No topology information available on FreeBSD including NUMA info */
 #define cpu_core_id(X) 0
@@ -112,3 +113,9 @@ rte_eal_cpu_init(void)
 
 	return 0;
 }
+
+unsigned
+eal_cpu_socket_id(__rte_unused unsigned cpu_id)
+{
+	return cpu_socket_id(cpu_id);
+}
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index b53b84d..f1ce0bd 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -50,4 +50,15 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg);
  */
 void eal_thread_init_master(unsigned lcore_id);
 
+/**
+ * Get the NUMA socket id from cpu id.
+ * This function is private to EAL.
+ *
+ * @param cpu_id
+ *   The logical process id.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+unsigned eal_cpu_socket_id(unsigned cpu_id);
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index 29615f8..ef8c433 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -45,6 +45,7 @@
 
 #include "eal_private.h"
 #include "eal_filesystem.h"
+#include "eal_thread.h"
 
 #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u"
 #define CORE_ID_FILE "topology/core_id"
@@ -71,8 +72,8 @@ cpu_detected(unsigned lcore_id)
  * Note: physical package id != NUMA node, but we use it as a
  * fallback for kernels which don't create a nodeY link
  */
-static unsigned
-cpu_socket_id(unsigned lcore_id)
+unsigned
+eal_cpu_socket_id(unsigned lcore_id)
 {
 	const char node_prefix[] = "node";
 	const size_t prefix_len = sizeof(node_prefix) - 1;
@@ -174,7 +175,7 @@ rte_eal_cpu_init(void)
 		/* By default, each detected core is enabled */
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = cpu_core_id(lcore_id);
-		lcore_config[lcore_id].socket_id = cpu_socket_id(lcore_id);
+		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
 		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES)
 #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
 			lcore_config[lcore_id].socket_id = 0;
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 06/19] eal: new TLS definition and API declaration
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (4 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 05/19] eal: add support parsing socket_id from cpuset Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 07/19] eal: add eal_common_thread.c for common thread API Cunming Liang
                             ` (13 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

1. add two TLS variables, *_socket_id* and *_cpuset*
2. add two external APIs, rte_thread_set_affinity() and rte_thread_get_affinity()
3. add one internal API, eal_thread_dump_affinity()
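
The dump behaviour documented for eal_thread_dump_affinity() (a
comma-separated cpu list, with -1 returned on truncation) can be
sketched as below. This is an assumption-laden illustration: a 64-bit
mask replaces the per-thread cpuset, and sketch_dump_mask() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the cpus set in mask to str as "a,b,c".
 * Returns 0 on success, -1 if the buffer is too small; in that case
 * str holds the entries that fit, still NUL-terminated.
 */
static int
sketch_dump_mask(uint64_t mask, char *str, size_t size)
{
	size_t out = 0;
	unsigned cpu;
	int ret;

	assert(str != NULL && size > 0);
	str[0] = '\0';

	for (cpu = 0; cpu < 64; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;

		ret = snprintf(str + out, size - out, "%u,", cpu);
		if (ret < 0 || (size_t)ret >= size - out) {
			/* truncated: cut back to the last complete entry */
			str[out > 0 ? out - 1 : 0] = '\0';
			return -1;
		}
		out += (size_t)ret;
	}

	/* remove the trailing separator */
	if (out > 0)
		str[out - 1] = '\0';
	return 0;
}
```

For mask 0x41 and a 16-byte buffer the result is the string "0,6"; with
a 3-byte buffer only "0" fits and the function returns -1.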

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   add comments for RTE_CPU_AFFINITY_STR_LEN
   update comments for eal_thread_dump_affinity()
   return void for rte_thread_get_affinity()
   move rte_socket_id() change to another patch

 lib/librte_eal/bsdapp/eal/eal_thread.c    |  2 ++
 lib/librte_eal/common/eal_thread.h        | 36 +++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_lcore.h | 26 +++++++++++++++++++++-
 lib/librte_eal/linuxapp/eal/eal_thread.c  |  2 ++
 4 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index ab05368..10220c7 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h
index f1ce0bd..e4e76b9 100644
--- a/lib/librte_eal/common/eal_thread.h
+++ b/lib/librte_eal/common/eal_thread.h
@@ -34,6 +34,8 @@
 #ifndef EAL_THREAD_H
 #define EAL_THREAD_H
 
+#include <rte_lcore.h>
+
 /**
  * basic loop of thread, called for each thread by eal_init().
  *
@@ -61,4 +63,38 @@ void eal_thread_init_master(unsigned lcore_id);
  */
 unsigned eal_cpu_socket_id(unsigned cpu_id);
 
+/**
+ * Get the NUMA socket id from cpuset.
+ * This function is private to EAL.
+ *
+ * @param cpusetp
+ *   The point to a valid cpu set.
+ * @return
+ *   socket_id or SOCKET_ID_ANY
+ */
+int eal_cpuset_socket_id(rte_cpuset_t *cpusetp);
+
+/**
+ * Default buffer size to use with eal_thread_dump_affinity()
+ */
+#define RTE_CPU_AFFINITY_STR_LEN            256
+
+/**
+ * Dump the current pthread cpuset.
+ * This function is private to EAL.
+ *
+ * Note:
+ *   If the dump size is greater than the size of given buffer,
+ *   the string will be truncated and with '\0' at the end.
+ *
+ * @param str
+ *   The string buffer the cpuset will dump to.
+ * @param size
+ *   The string buffer size.
+ * @return
+ *   0 for success, -1 if truncation happens.
+ */
+int
+eal_thread_dump_affinity(char *str, unsigned size);
+
 #endif /* EAL_THREAD_H */
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 4c7d6bb..33f558e 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -80,7 +80,9 @@ struct lcore_config {
  */
 extern struct lcore_config lcore_config[RTE_MAX_LCORE];
 
-RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */
+RTE_DECLARE_PER_LCORE(unsigned, _lcore_id);  /**< Per thread "lcore id". */
+RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */
+RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
 
 /**
  * Return the ID of the execution unit we are running on.
@@ -229,6 +231,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap)
 	     i<RTE_MAX_LCORE;						\
 	     i = rte_get_next_lcore(i, 1, 0))
 
+/**
+ * Set core affinity of the current thread.
+ * Supports both EAL and non-EAL threads and updates TLS.
+ *
+ * @param cpusetp
+ *   Point to cpu_set_t for setting current thread affinity.
+ * @return
+ *   On success, return 0; otherwise return -1;
+ */
+int rte_thread_set_affinity(rte_cpuset_t *cpusetp);
+
+/**
+ * Get core affinity of the current thread.
+ *
+ * @param cpusetp
+ *   Point to cpu_set_t for getting current thread cpu affinity.
+ *   It presumes input is not NULL, otherwise it causes panic.
+ *
+ */
+void rte_thread_get_affinity(rte_cpuset_t *cpusetp);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 80a985f..748a83a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,6 +56,8 @@
 #include "eal_thread.h"
 
 RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
  * Send a message to a slave lcore identified by slave_id to call a
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 07/19] eal: add eal_common_thread.c for common thread API
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (5 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 06/19] eal: new TLS definition and API declaration Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 08/19] eal: standardize init sequence between linux and bsd Cunming Liang
                             ` (12 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

The API works for both EAL threads and non-EAL threads.
When rte_thread_set_affinity() is called, the *_socket_id* and
*_cpuset* of the calling thread are updated if the thread
successfully sets its cpu affinity.
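
The "update TLS only on success" pattern described above can be sketched
without DPDK as follows. This is a Linux-only illustration under stated
assumptions: it calls sched_setaffinity() on the calling thread instead
of the patch's pthread_setaffinity_np(), and all sketch_* names are
hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <string.h>

/* per-thread cache, playing the role of the *_cpuset* TLS variable */
static __thread cpu_set_t sketch_cpuset;
static __thread int sketch_cpuset_valid;

/* Apply the affinity, then cache it; the cache is untouched on failure. */
static int
sketch_set_affinity(const cpu_set_t *set)
{
	assert(set != NULL);

	/* pid 0 means "the calling thread" for sched_setaffinity() */
	if (sched_setaffinity(0, sizeof(*set), set) != 0)
		return -1;

	memcpy(&sketch_cpuset, set, sizeof(*set));
	sketch_cpuset_valid = 1;
	return 0;
}

/* Read the cached affinity back without a system call. */
static void
sketch_get_affinity(cpu_set_t *set)
{
	assert(set != NULL && sketch_cpuset_valid);
	memcpy(set, &sketch_cpuset, sizeof(*set));
}
```

Because the getter reads only the cached copy, it reflects the last
successful sketch_set_affinity() call made by the same thread, not
changes applied from outside.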

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   refine code of rte_thread_set_affinity()
   change rte_thread_get_affinity() return to void

 lib/librte_eal/bsdapp/eal/Makefile        |   1 +
 lib/librte_eal/common/eal_common_thread.c | 150 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile      |   2 +
 3 files changed, 153 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_thread.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ae214a4..2357cfa 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -77,6 +77,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 #CFLAGS_eal_thread.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
new file mode 100644
index 0000000..f4d9892
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -0,0 +1,150 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <sched.h>
+#include <assert.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_memory.h>
+#include <rte_log.h>
+
+#include "eal_thread.h"
+
+int eal_cpuset_socket_id(rte_cpuset_t *cpusetp)
+{
+	unsigned cpu = 0;
+	int socket_id = SOCKET_ID_ANY;
+	int sid;
+
+	if (cpusetp == NULL)
+		return SOCKET_ID_ANY;
+
+	do {
+		if (!CPU_ISSET(cpu, cpusetp))
+			continue;
+
+		if (socket_id == SOCKET_ID_ANY)
+			socket_id = eal_cpu_socket_id(cpu);
+
+		sid = eal_cpu_socket_id(cpu);
+		if (socket_id != sid) {
+			socket_id = SOCKET_ID_ANY;
+			break;
+		}
+
+	} while (++cpu < RTE_MAX_LCORE);
+
+	return socket_id;
+}
+
+int
+rte_thread_set_affinity(rte_cpuset_t *cpusetp)
+{
+	int s;
+	unsigned lcore_id;
+	pthread_t tid;
+
+	tid = pthread_self();
+
+	s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp);
+	if (s != 0) {
+		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
+		return -1;
+	}
+
+	/* store socket_id in TLS for quick access */
+	RTE_PER_LCORE(_socket_id) =
+		eal_cpuset_socket_id(cpusetp);
+
+	/* store cpuset in TLS for quick access */
+	memmove(&RTE_PER_LCORE(_cpuset), cpusetp,
+		sizeof(rte_cpuset_t));
+
+	lcore_id = rte_lcore_id();
+	if (lcore_id != (unsigned)LCORE_ID_ANY) {
+		/* EAL thread will update lcore_config */
+		lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id);
+		memmove(&lcore_config[lcore_id].cpuset, cpusetp,
+			sizeof(rte_cpuset_t));
+	}
+
+	return 0;
+}
+
+void
+rte_thread_get_affinity(rte_cpuset_t *cpusetp)
+{
+	assert(cpusetp);
+	memmove(cpusetp, &RTE_PER_LCORE(_cpuset),
+		sizeof(rte_cpuset_t));
+}
+
+int
+eal_thread_dump_affinity(char *str, unsigned size)
+{
+	rte_cpuset_t cpuset;
+	unsigned cpu;
+	int ret;
+	unsigned int out = 0;
+
+	rte_thread_get_affinity(&cpuset);
+
+	for (cpu = 0; cpu < RTE_MAX_LCORE; cpu++) {
+		if (!CPU_ISSET(cpu, &cpuset))
+			continue;
+
+		ret = snprintf(str + out,
+			       size - out, "%u,", cpu);
+		if (ret < 0 || (unsigned)ret >= size - out) {
+			/* string will be truncated */
+			ret = -1;
+			goto exit;
+		}
+
+		out += ret;
+	}
+
+	ret = 0;
+exit:
+	/* remove the last separator */
+	if (out > 0)
+		str[out - 1] = '\0';
+
+	return ret;
+}
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 4da9d27..23c2d48 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -89,6 +89,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_thread.c
 
 CFLAGS_eal.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
@@ -100,6 +101,7 @@ CFLAGS_eal_pci.o := -D_GNU_SOURCE
 CFLAGS_eal_pci_vfio.o := -D_GNU_SOURCE
 CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
 CFLAGS_eal_common_options.o := -D_GNU_SOURCE
+CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
 
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 08/19] eal: standardize init sequence between linux and bsd
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (6 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 07/19] eal: add eal_common_thread.c for common thread API Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 09/19] eal: add rte_gettid() to acquire unique system tid Cunming Liang
                             ` (11 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev


Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..cb11b5c 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -509,6 +509,8 @@ rte_eal_init(int argc, char **argv)
 
 	rte_eal_mcfg_complete();
 
+	eal_thread_init_master(rte_config.master_lcore);
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
@@ -532,8 +534,6 @@ rte_eal_init(int argc, char **argv)
 			rte_panic("Cannot create thread\n");
 	}
 
-	eal_thread_init_master(rte_config.master_lcore);
-
 	/*
 	 * Launch a dummy function on all slave lcores, so that master lcore
 	 * knows they are all ready when this function returns.
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 09/19] eal: add rte_gettid() to acquire unique system tid
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (7 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 08/19] eal: standardize init sequence between linux and bsd Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 10/19] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
                             ` (10 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

rte_gettid() wraps the Linux gettid() syscall and the FreeBSD thr_self() syscall.
It provides a persistent unique thread id for the calling thread.
The id is saved in TLS on the first call.
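
A minimal Linux-only sketch of the caching idea, independent of the DPDK
per-lcore macros. The names sketch_gettid() and sketch_cached_tid are
hypothetical; the patch itself uses RTE_DEFINE_PER_LCORE and
rte_sys_gettid().

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* per-thread cache; -1 means "not fetched yet" */
static __thread int sketch_cached_tid = -1;

/* Return the kernel thread id, issuing the syscall only once per thread. */
static int
sketch_gettid(void)
{
	if (sketch_cached_tid == -1) {
		sketch_cached_tid = (int)syscall(SYS_gettid);
		assert(sketch_cached_tid > 0);
	}
	return sketch_cached_tid;
}
```

Later calls from the same thread return the cached value, so the cost of
the system call is paid only on first use.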

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_thread.c   |  9 +++++++++
 lib/librte_eal/common/include/rte_eal.h  | 27 +++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_thread.c |  7 +++++++
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 10220c7..d0c077b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <sched.h>
 #include <pthread_np.h>
 #include <sys/queue.h>
+#include <sys/thr.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* require calling thread tid by gettid() */
+int rte_sys_gettid(void)
+{
+	long lwpid;
+	thr_self(&lwpid);
+	return (int)lwpid;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..8ccdd65 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -41,6 +41,9 @@
  */
 
 #include <stdint.h>
+#include <sched.h>
+
+#include <rte_per_lcore.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t usage_func );
  */
 int rte_eal_has_hugepages(void);
 
+/**
+ * A wrap API for syscall gettid.
+ *
+ * @return
+ *   Returns the thread ID of the calling thread.
+ *   It always succeeds.
+ */
+int rte_sys_gettid(void);
+
+/**
+ * Get system unique thread id.
+ *
+ * @return
+ *   Returns the thread ID of the calling thread.
+ *   It always succeeds.
+ */
+static inline int rte_gettid(void)
+{
+	static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
+	if (RTE_PER_LCORE(_thread_id) == -1)
+		RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
+	return RTE_PER_LCORE(_thread_id);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 748a83a..ed20c93 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -39,6 +39,7 @@
 #include <pthread.h>
 #include <sched.h>
 #include <sys/queue.h>
+#include <sys/syscall.h>
 
 #include <rte_debug.h>
 #include <rte_atomic.h>
@@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	/* pthread_exit(NULL); */
 	/* return NULL; */
 }
+
+/* retrieve the calling thread's tid via gettid() */
+int rte_sys_gettid(void)
+{
+	return (int)syscall(SYS_gettid);
+}
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 10/19] eal: apply affinity of EAL thread by assigned cpuset
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (8 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 09/19] eal: add rte_gettid() to acquire unique system tid Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 11/19] enic: fix re-define freebsd compile complain Cunming Liang
                             ` (9 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

EAL threads use their assigned cpuset to set core affinity during startup.
The 1:1 mapping is kept if no '--lcores' option is used.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   add return check for dump_affinity
   call rte_thread_set_affinity() directly during EAL thread set

 lib/librte_eal/bsdapp/eal/eal.c           | 10 +++--
 lib/librte_eal/bsdapp/eal/eal_thread.c    | 64 ++++++------------------------
 lib/librte_eal/common/include/rte_lcore.h |  2 +-
 lib/librte_eal/linuxapp/eal/eal.c         |  8 +++-
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 66 ++++++-------------------------
 5 files changed, 37 insertions(+), 113 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index cb11b5c..b66f6c6 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv)
 	int i, fctret, ret;
 	pthread_t thread_id;
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
+	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -502,15 +503,18 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n",
-		rte_config.master_lcore, thread_id);
-
 	eal_check_mem_on_local_socket();
 
 	rte_eal_mcfg_complete();
 
 	eal_thread_init_master(rte_config.master_lcore);
 
+	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s%s])\n",
+		rte_config.master_lcore, thread_id, cpuset,
+		ret == 0 ? "" : "...");
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
 
diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index d0c077b..e16f685 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -101,58 +101,13 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
 static int
 eal_thread_set_affinity(void)
 {
-	int s;
-	pthread_t thread;
+	unsigned lcore_id = rte_lcore_id();
 
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	/* acquire system unique id  */
+	rte_gettid();
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
-		return -1;
-	}
-
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpuset_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
-
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
-	return 0;
+	/* update EAL thread core affinity */
+	return rte_thread_set_affinity(&lcore_config[lcore_id].cpuset);
 }
 
 void eal_thread_init_master(unsigned lcore_id)
@@ -174,6 +129,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +141,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%p)\n",
-		lcore_id, thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +151,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%p;cpuset=[%s%s])\n",
+		lcore_id, thread_id, cpuset, ret == 0 ? "" : "...");
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 33f558e..6a5bcbc 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -148,7 +148,7 @@ rte_lcore_index(int lcore_id)
 static inline unsigned
 rte_socket_id(void)
 {
-	return lcore_config[rte_lcore_id()].socket_id;
+	return RTE_PER_LCORE(_socket_id);
 }
 
 /**
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f99e158..137a16c 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -702,6 +702,7 @@ rte_eal_init(int argc, char **argv)
 	static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0);
 	struct shared_driver *solib = NULL;
 	const char *logid;
+	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
@@ -802,8 +803,11 @@ rte_eal_init(int argc, char **argv)
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-	RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%x)\n",
-		rte_config.master_lcore, (int)thread_id);
+	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
+		rte_config.master_lcore, (int)thread_id, cpuset,
+		ret == 0 ? "" : "...");
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index ed20c93..57b0515 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -97,62 +97,17 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
 	return 0;
 }
 
-/* set affinity for current thread */
+/* set affinity for current EAL thread */
 static int
 eal_thread_set_affinity(void)
 {
-	int s;
-	pthread_t thread;
+	unsigned lcore_id = rte_lcore_id();
 
-/*
- * According to the section VERSIONS of the CPU_ALLOC man page:
- *
- * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added
- * in glibc 2.3.3.
- *
- * CPU_COUNT() first appeared in glibc 2.6.
- *
- * CPU_AND(),     CPU_OR(),     CPU_XOR(),    CPU_EQUAL(),    CPU_ALLOC(),
- * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(),  CPU_SET_S(),  CPU_CLR_S(),
- * CPU_ISSET_S(),  CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S()
- * first appeared in glibc 2.7.
- */
-#if defined(CPU_ALLOC)
-	size_t size;
-	cpu_set_t *cpusetp;
-
-	cpusetp = CPU_ALLOC(RTE_MAX_LCORE);
-	if (cpusetp == NULL) {
-		RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n");
-		return -1;
-	}
-
-	size = CPU_ALLOC_SIZE(RTE_MAX_LCORE);
-	CPU_ZERO_S(size, cpusetp);
-	CPU_SET_S(rte_lcore_id(), size, cpusetp);
+	/* acquire system unique id  */
+	rte_gettid();
 
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, size, cpusetp);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		CPU_FREE(cpusetp);
-		return -1;
-	}
-
-	CPU_FREE(cpusetp);
-#else /* CPU_ALLOC */
-	cpu_set_t cpuset;
-	CPU_ZERO( &cpuset );
-	CPU_SET( rte_lcore_id(), &cpuset );
-
-	thread = pthread_self();
-	s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset);
-	if (s != 0) {
-		RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n");
-		return -1;
-	}
-#endif
-	return 0;
+	/* update EAL thread core affinity */
+	return rte_thread_set_affinity(&lcore_config[lcore_id].cpuset);
 }
 
 void eal_thread_init_master(unsigned lcore_id)
@@ -174,6 +129,7 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	unsigned lcore_id;
 	pthread_t thread_id;
 	int m2s, s2m;
+	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
 
 	thread_id = pthread_self();
 
@@ -185,9 +141,6 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (lcore_id == RTE_MAX_LCORE)
 		rte_panic("cannot retrieve lcore id\n");
 
-	RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%x)\n",
-		lcore_id, (int)thread_id);
-
 	m2s = lcore_config[lcore_id].pipe_master2slave[0];
 	s2m = lcore_config[lcore_id].pipe_slave2master[1];
 
@@ -198,6 +151,11 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 	if (eal_thread_set_affinity() < 0)
 		rte_panic("cannot set affinity\n");
 
+	ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
+
+	RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
+		lcore_id, (int)thread_id, cpuset, ret == 0 ? "" : "...");
+
 	/* read on our pipe to get commands */
 	while (1) {
 		void *fct_arg;
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 11/19] enic: fix re-define freebsd compile complain
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (9 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 10/19] eal: apply affinity of EAL thread by assigned cpuset Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 12/19] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
                             ` (8 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

Some macros are already defined by FreeBSD's 'sys/param.h'.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   rename the redefined macros instead of undefining them

 lib/librte_pmd_enic/enic.h          | 4 ++--
 lib/librte_pmd_enic/enic_compat.h   | 2 +-
 lib/librte_pmd_enic/vnic/vnic_dev.c | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
index c43417c..57b9c80 100644
--- a/lib/librte_pmd_enic/enic.h
+++ b/lib/librte_pmd_enic/enic.h
@@ -66,9 +66,9 @@
 #define ENIC_CALC_IP_CKSUM      1
 #define ENIC_CALC_TCP_UDP_CKSUM 2
 #define ENIC_MAX_MTU            9000
-#define PAGE_SIZE               4096
+#define ENIC_PAGE_SIZE          4096
 #define PAGE_ROUND_UP(x) \
-	((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
+	((((unsigned long)(x)) + ENIC_PAGE_SIZE-1) & (~(ENIC_PAGE_SIZE-1)))
 
 #define ENICPMD_VFIO_PATH          "/dev/vfio/vfio"
 /*#define ENIC_DESC_COUNT_MAKE_ODD (x) do{if ((~(x)) & 1) { (x)--; } }while(0)*/
diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
index b1af838..40c9b44 100644
--- a/lib/librte_pmd_enic/enic_compat.h
+++ b/lib/librte_pmd_enic/enic_compat.h
@@ -67,7 +67,7 @@
 #define pr_warn(y, args...) dev_warning(0, y, ##args)
 #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__)
 
-#define ALIGN(x, a)              __ALIGN_MASK(x, (typeof(x))(a)-1)
+#define VNIC_ALIGN(x, a)         __ALIGN_MASK(x, (typeof(x))(a)-1)
 #define __ALIGN_MASK(x, mask)    (((x)+(mask))&~(mask))
 #define udelay usleep
 #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
diff --git a/lib/librte_pmd_enic/vnic/vnic_dev.c b/lib/librte_pmd_enic/vnic/vnic_dev.c
index 6407994..38b7f25 100644
--- a/lib/librte_pmd_enic/vnic/vnic_dev.c
+++ b/lib/librte_pmd_enic/vnic/vnic_dev.c
@@ -242,9 +242,9 @@ unsigned int vnic_dev_desc_ring_size(struct vnic_dev_ring *ring,
 	if (desc_count == 0)
 		desc_count = 4096;
 
-	ring->desc_count = ALIGN(desc_count, count_align);
+	ring->desc_count = VNIC_ALIGN(desc_count, count_align);
 
-	ring->desc_size = ALIGN(desc_size, desc_align);
+	ring->desc_size = VNIC_ALIGN(desc_size, desc_align);
 
 	ring->size = ring->desc_count * ring->desc_size;
 	ring->size_unaligned = ring->size + ring->base_align;
@@ -294,7 +294,7 @@ int vnic_dev_alloc_desc_ring(__attribute__((unused)) struct vnic_dev *vdev,
 
 	ring->base_addr_unaligned = (dma_addr_t)rz->phys_addr;
 
-	ring->base_addr = ALIGN(ring->base_addr_unaligned,
+	ring->base_addr = VNIC_ALIGN(ring->base_addr_unaligned,
 		ring->base_align);
 	ring->descs = (u8 *)ring->descs_unaligned +
 	    (ring->base_addr - ring->base_addr_unaligned);
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 12/19] malloc: fix the issue of SOCKET_ID_ANY
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (10 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 11/19] enic: fix re-define freebsd compile complain Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 13/19] log: fix the gap to support non-EAL thread Cunming Liang
                             ` (7 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

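Add a check on the value returned by rte_socket_id() so that an unexpected
value such as SOCKET_ID_ANY (-1) is not used as the NUMA socket; socket 0 is
used in that case.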

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
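 Note: a minimal sketch of the check, for illustration only. The helper
 pick_numa_socket() is hypothetical (it takes the socket id as a parameter
 instead of calling rte_socket_id()), and SOCKET_ID_ANY is redefined locally
 with the value (-1) assumed from the description above.

```c
#include <limits.h>

#define SOCKET_ID_ANY (-1)	/* assumed value: "any socket" */

/* Hypothetical stand-in for malloc_get_numa_socket(). */
static unsigned
pick_numa_socket(unsigned socket_id)
{
	/* (-1) converted to unsigned is UINT_MAX, never a valid socket */
	if (socket_id == (unsigned)SOCKET_ID_ANY)
		return 0;

	return socket_id;
}
```

 Falling back to socket 0 keeps the returned value usable as an index for
 threads that have no socket assigned yet.
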
 lib/librte_malloc/malloc_heap.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-	return rte_socket_id();
+	unsigned socket_id = rte_socket_id();
+
+	if (socket_id == (unsigned)SOCKET_ID_ANY)
+		return 0;
+
+	return socket_id;
 }
 
 void *
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 13/19] log: fix the gap to support non-EAL thread
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (11 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 12/19] malloc: fix the issue of SOCKET_ID_ANY Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 14/19] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
                             ` (6 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

For a non-EAL thread, *_lcore_id* is invalid and probably larger than RTE_MAX_LCORE.
The patch adds a check so that only EAL threads use the EAL per-thread log level and log type.
Other threads share the global log level and log type.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
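 Note: a simplified sketch of the guard pattern, not the actual rte_log code.
 MAX_LCORE, LCORE_ANY, per_lcore_msg and global_level are hypothetical
 stand-ins for RTE_MAX_LCORE, the non-EAL lcore id, log_cur_msg[] and the
 global log level; the lcore id is passed in rather than read from TLS.

```c
#include <stdint.h>

#define MAX_LCORE 128u		/* stands in for RTE_MAX_LCORE */
#define LCORE_ANY UINT32_MAX	/* id seen by a non-EAL thread */

struct cur_msg {
	uint32_t loglevel;
	uint32_t logtype;
};

static struct cur_msg per_lcore_msg[MAX_LCORE];
static uint32_t global_level = 8;	/* arbitrary global default */

/* Save the level only when the id is a valid array index. */
static void
save_cur_level(unsigned lcore_id, uint32_t level)
{
	if (lcore_id < MAX_LCORE)
		per_lcore_msg[lcore_id].loglevel = level;
}

/* Out-of-range ids fall back to the global level. */
static uint32_t
cur_msg_loglevel(unsigned lcore_id)
{
	if (lcore_id >= MAX_LCORE)
		return global_level;

	return per_lcore_msg[lcore_id].loglevel;
}
```

 Both the write path and the read path need the range check; guarding only
 one of them would still leave an out-of-bounds access.
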
 lib/librte_eal/common/eal_common_log.c  | 17 +++++++++++++++--
 lib/librte_eal/common/include/rte_log.h |  5 +++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
 		rte_logs.type &= (~type);
 }
 
+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+	return rte_logs.type;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_level();
 	return log_cur_msg[lcore_id].loglevel;
 }
 
@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_type();
 	return log_cur_msg[lcore_id].logtype;
 }
 
@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,
 
 	/* save loglevel and logtype in a global per-lcore variable */
 	lcore_id = rte_lcore_id();
-	log_cur_msg[lcore_id].loglevel = level;
-	log_cur_msg[lcore_id].logtype = logtype;
+	if (lcore_id < RTE_MAX_LCORE) {
+		log_cur_msg[lcore_id].loglevel = level;
+		log_cur_msg[lcore_id].logtype = logtype;
+	}
 
 	ret = vfprintf(f, format, ap);
 	fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);
 
 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 14/19] eal: set _lcore_id and _socket_id to (-1) by default
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (12 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 13/19] log: fix the gap to support non-EAL thread Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 15/19] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
                             ` (5 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

For a non-EAL thread, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an index need to take care of this case.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its affinity
by calling rte_thread_set_affinity().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
    define LCORE_ID_ANY as UINT32_MAX      

 lib/librte_eal/bsdapp/eal/eal_thread.c    | 4 ++--
 lib/librte_eal/common/include/rte_lcore.h | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index e16f685..ca95c72 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 6a5bcbc..ad47221 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -48,7 +48,7 @@
 extern "C" {
 #endif
 
-#define LCORE_ID_ANY -1    /**< Any lcore. */
+#define LCORE_ID_ANY     UINT32_MAX       /**< Any lcore. */
 
 #if defined(__linux__)
 	typedef	cpu_set_t rte_cpuset_t;
@@ -87,7 +87,7 @@ RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */
 /**
  * Return the ID of the execution unit we are running on.
  * @return
- *  Logical core ID
+ *  Logical core ID(in EAL thread) or LCORE_ID_ANY(in non-EAL thread)
  */
 static inline unsigned
 rte_lcore_id(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 57b0515..5635c7d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"
 
-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
 
 /*
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 15/19] eal: fix recursive spinlock in non-EAL thraed
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (13 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 14/19] eal: set _lcore_id and _socket_id to (-1) by default Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 16/19] mempool: add support to non-EAL thread Cunming Liang
                             ` (4 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

In a non-EAL thread, lcore_id is always LCORE_ID_ANY.
It cannot be used as the unique owner id for a recursive spinlock.
Use rte_gettid() instead.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
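 Note: a simplified model of why the owner id must be unique, written with
 C11 atomics rather than the rte_spinlock API. struct rec_lock and the
 rec_* helpers are hypothetical names; the caller's id is passed in
 explicitly so the behaviour can be checked from a single thread.

```c
#include <stdatomic.h>

struct rec_lock {
	atomic_flag sl;		/* underlying non-recursive lock */
	volatile int user;	/* owner id, -1 when free */
	volatile int count;	/* recursion depth */
};

static void
rec_lock_init(struct rec_lock *l)
{
	atomic_flag_clear(&l->sl);
	l->user = -1;
	l->count = 0;
}

/* Returns 1 on success, 0 if the lock is held under a different id. */
static int
rec_trylock(struct rec_lock *l, int id)
{
	if (l->user != id) {
		if (atomic_flag_test_and_set(&l->sl))
			return 0;
		l->user = id;
	}
	/* same id as the owner: treated as a recursive acquire */
	l->count++;
	return 1;
}

static void
rec_unlock(struct rec_lock *l)
{
	if (--l->count == 0) {
		l->user = -1;
		atomic_flag_clear(&l->sl);
	}
}
```

 In this model a caller presenting the owner's id always takes the recursive
 path. If every non-EAL thread presented the same id, a second thread would
 be treated as the owner and skip the real acquire, which is why a
 per-thread id is needed.
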
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();
 
 	if (slr->user != id) {
 		if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 16/19] mempool: add support to non-EAL thread
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (14 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 15/19] eal: fix recursive spinlock in non-EAL thraed Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 17/19] ring: " Cunming Liang
                             ` (3 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

For a non-EAL thread, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used from either an EAL thread or any user pthread.
In a non-EAL thread the mempool relies directly on rte_ring, which is non-preemptive.
Running multiple pthreads per CPU that compete for the same rte_mempool is not recommended.
It will give bad performance and carries a critical risk if the scheduling policy is RT.
No significant performance decrease was found with mempool_perf_test.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   check __lcore_id with LCORE_ID_ANY instead of RTE_MAX_LCORE
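
 Note: a toy model of the bypass, assuming a pool that only counts objects.
 struct toy_pool, toy_put(), MAX_LCORE and LCORE_ANY are hypothetical names
 used for illustration; the real put/get paths also handle cache overflow
 and bulk sizes, which are left out here.

```c
#include <stdint.h>

#define MAX_LCORE 4u		/* stands in for RTE_MAX_LCORE */
#define LCORE_ANY UINT32_MAX	/* id seen by a non-EAL thread */

struct toy_pool {
	unsigned ring_count;			/* objects in the shared ring */
	unsigned cache_count[MAX_LCORE];	/* objects per lcore cache */
};

/* Put one object; returns 1 if it went to a cache, 0 if to the ring. */
static int
toy_put(struct toy_pool *p, unsigned lcore_id)
{
	/* a non-EAL thread has no cache slot: go straight to the ring */
	if (lcore_id == LCORE_ANY) {
		p->ring_count++;
		return 0;
	}

	p->cache_count[lcore_id]++;
	return 1;
}
```

 The check has to come before the cache array is indexed, since
 LCORE_ANY is far outside the bounds of cache_count[].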

 lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..2c0b960 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
-		unsigned __lcore_id = rte_lcore_id();		\
-		mp->stats[__lcore_id].name##_objs += n;		\
-		mp->stats[__lcore_id].name##_bulk += 1;		\
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {                    \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id != LCORE_ID_ANY) {               \
+			mp->stats[__lcore_id].name##_objs += n;	\
+			mp->stats[__lcore_id].name##_bulk += 1;	\
+		}                                               \
 	} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 	__MEMPOOL_STAT_ADD(mp, put, n);
 
 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-	/* cache is not enabled or single producer */
-	if (unlikely(cache_size == 0 || is_mp == 0))
+	/* cache is not enabled or single producer or non-EAL thread */
+	if (unlikely(cache_size == 0 || is_mp == 0 ||
+		     lcore_id == LCORE_ID_ANY))
 		goto ring_enqueue;
 
 	/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
 	uint32_t cache_size = mp->cache_size;
 
 	/* cache is not enabled or single consumer */
-	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+	if (unlikely(cache_size == 0 || is_mc == 0 ||
+		     n >= cache_size || lcore_id == LCORE_ID_ANY))
 		goto ring_dequeue;
 
 	cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 17/19] ring: add support to non-EAL thread
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (15 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 16/19] mempool: add support to non-EAL thread Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 18/19] ring: add sched_yield to avoid spin forever Cunming Liang
                             ` (2 subsequent siblings)
  19 siblings, 0 replies; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

The ring debug statistics are not updated for non-EAL threads.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   check __lcore_id with LCORE_ID_ANY instead of RTE_MAX_LCORE

 lib/librte_ring/rte_ring.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..553a880 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {		\
-		unsigned __lcore_id = rte_lcore_id();	\
-		r->stats[__lcore_id].name##_objs += n;	\
-		r->stats[__lcore_id].name##_bulk += 1;	\
+#define __RING_STAT_ADD(r, name, n) do {                        \
+		unsigned __lcore_id = rte_lcore_id();           \
+		if (__lcore_id != LCORE_ID_ANY) {               \
+			r->stats[__lcore_id].name##_objs += n;  \
+			r->stats[__lcore_id].name##_bulk += 1;  \
+		}                                               \
 	} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 253+ messages in thread

* [dpdk-dev] [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (16 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 17/19] ring: " Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12 11:16             ` Olivier MATZ
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 19/19] timer: add support to non-EAL thread Cunming Liang
  2015-02-13  1:38           ` [dpdk-dev] [PATCH v6 00/19] support multi-pthread per core Cunming Liang
  19 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

Add a sched_yield() syscall if the thread spins for too long while waiting
for another thread to finish its operations on the ring.
That gives a pre-empted thread a chance to proceed and finish its ring
enqueue/dequeue operation.
The purpose is to reduce contention on the ring. ring_perf_test shows no
additional performance penalty.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   add RTE_RING_PAUSE_REP to config file

 v4 changes:
   update and add more comments on sched_yield()

 v3 changes:
   new patch adding sched_yield() in rte_ring to avoid long spin
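
 Note: a standalone sketch of the spin-then-yield loop, for illustration.
 PAUSE_REP is a hypothetical local constant standing in for
 RTE_RING_PAUSE_REP, and pause_hook replaces rte_pause() so that the loop can
 be driven from a single thread; bump_tail() plays the role of the other
 thread making progress.

```c
#include <sched.h>

#define PAUSE_REP 4u	/* yield after this many pauses; 0 disables */

/* Spin until *tail == expected; returns how many times we yielded. */
static unsigned
wait_for_tail(volatile unsigned *tail, unsigned expected,
	      void (*pause_hook)(volatile unsigned *))
{
	unsigned rep = 0, yields = 0;

	while (*tail != expected) {
		pause_hook(tail);

		/* after PAUSE_REP pauses, let a pre-empted thread run */
		if (PAUSE_REP && ++rep == PAUSE_REP) {
			rep = 0;
			sched_yield();
			yields++;
		}
	}
	return yields;
}

/* Test hook: simulates another thread advancing the tail by one. */
static void
bump_tail(volatile unsigned *tail)
{
	(*tail)++;
}
```

 With PAUSE_REP set to 0 the condition short-circuits and the loop never
 yields, which matches the "no yield" default described above.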

 config/common_bsdapp       |  1 +
 config/common_linuxapp     |  1 +
 lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
 3 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..52c5143 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP=n
 
 #
 # Compile librte_mempool
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..0b4eb3c 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -242,6 +242,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP=n
 
 #
 # Compile librte_mempool
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 553a880..eecba0b 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -127,6 +127,11 @@ struct rte_ring_debug_stats {
 #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
 #define RTE_RING_MZ_PREFIX "RG_"
 
+#ifndef RTE_RING_PAUSE_REP
+#define RTE_RING_PAUSE_REP 0 /**< yield after pause num of times,
+			      * no yield if RTE_RING_PAUSE_REP not defined. */
+#endif
+
 /**
  * An RTE ring structure.
  *
@@ -410,7 +415,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;
 	int ret;
 
@@ -468,9 +473,18 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head))
+	while (unlikely(r->prod.tail != prod_head)) {
 		rte_pause();
 
+		/* Set RTE_RING_PAUSE_REP to avoid spinning too long waiting
+		 * for another thread to finish. It gives a pre-empted thread
+		 * a chance to proceed and finish its ring enqueue operation. */
+		if (RTE_RING_PAUSE_REP &&
+		    ++rep == RTE_RING_PAUSE_REP) {
+			rep = 0;
+			sched_yield();
+		}
+	}
 	r->prod.tail = prod_next;
 	return ret;
 }
@@ -589,7 +603,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;
 
 	/* move cons.head atomically */
@@ -634,9 +648,18 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head))
+	while (unlikely(r->cons.tail != cons_head)) {
 		rte_pause();
 
+		/* Set RTE_RING_PAUSE_REP to avoid spinning too long waiting
+		 * for another thread to finish. It gives a pre-empted thread
+		 * a chance to proceed and finish its ring dequeue operation. */
+		if (RTE_RING_PAUSE_REP &&
+		    ++rep == RTE_RING_PAUSE_REP) {
+			rep = 0;
+			sched_yield();
+		}
+	}
 	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 
-- 
1.8.1.4


* [dpdk-dev] [PATCH v5 19/19] timer: add support to non-EAL thread
  2015-02-12  8:16         ` [dpdk-dev] [PATCH v5 00/19] " Cunming Liang
                             ` (17 preceding siblings ...)
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 18/19] ring: add sched_yield to avoid spin forever Cunming Liang
@ 2015-02-12  8:16           ` Cunming Liang
  2015-02-12 13:54             ` Ananyev, Konstantin
  2015-02-13  1:38           ` [dpdk-dev] [PATCH v6 00/19] support multi-pthread per core Cunming Liang
  19 siblings, 1 reply; 253+ messages in thread
From: Cunming Liang @ 2015-02-12  8:16 UTC (permalink / raw)
  To: dev

Allow timers to be set up only for EAL (lcore) threads (__lcore_id < MAX_LCORE_ID).
For example, a dynamically created thread can reset or stop a timer owned by an lcore thread,
but it cannot set up a timer for itself or for another non-lcore thread.
Calling rte_timer_manage() from a non-lcore thread simply does nothing and returns straight away.
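
The target selection described above can be sketched in isolation as follows.
This is only an illustration, not the patch itself: every sketch_* name and
SKETCH_* constant is a hypothetical stand-in, all lcores are assumed to be
enabled, and the real code calls rte_get_next_lcore() instead of the toy
helper used here. Note also that the diff in this version guards
rte_timer_manage() with an assert rather than an early return.

```c
#include <stdint.h>

#define SKETCH_MAX_LCORE    4u          /* hypothetical stand-in for RTE_MAX_LCORE */
#define SKETCH_LCORE_ID_ANY UINT32_MAX  /* hypothetical stand-in for LCORE_ID_ANY */

/* Per-lcore round-robin cursor, standing in for priv_timer[].prev_lcore. */
static unsigned sketch_prev_lcore[SKETCH_MAX_LCORE];

/* Toy "next lcore" helper: assumes every lcore is enabled and wraps around.
 * Passing SKETCH_LCORE_ID_ANY yields the first lcore. */
static unsigned sketch_next_lcore(unsigned i)
{
	if (i == SKETCH_LCORE_ID_ANY)
		return 0;
	return (i + 1) % SKETCH_MAX_LCORE;
}

/* Choose the lcore that will own the timer.
 * - An explicit request is honoured as-is.
 * - An EAL caller round-robins using its own per-lcore cursor.
 * - A non-EAL caller has no per-lcore state, so the first lcore is used. */
static unsigned sketch_pick_target(unsigned caller, unsigned requested)
{
	if (requested != SKETCH_LCORE_ID_ANY)
		return requested;

	if (caller < SKETCH_MAX_LCORE) {
		unsigned next = sketch_next_lcore(sketch_prev_lcore[caller]);

		sketch_prev_lcore[caller] = next;
		return next;
	}

	return sketch_next_lcore(SKETCH_LCORE_ID_ANY);
}
```

The point of the bounds check is that a non-EAL caller never indexes the
per-lcore array, which is the same concern the RTE_MAX_LCORE test in
__TIMER_STAT_ADD addresses.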

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v5 changes:
   add assert in rte_timer_manage
   remove duplicate check in timer_set_config_state

 lib/librte_timer/rte_timer.c | 31 ++++++++++++++++++++++---------
 lib/librte_timer/rte_timer.h |  4 ++--
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..fa43fa9 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -35,6 +35,7 @@
 #include <stdio.h>
 #include <stdint.h>
 #include <inttypes.h>
+#include <assert.h>
 
 #include <rte_atomic.h>
 #include <rte_common.h>
@@ -79,9 +80,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];
 
 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do {				\
-		unsigned __lcore_id = rte_lcore_id();		\
-		priv_timer[__lcore_id].stats.name += (n);	\
+#define __TIMER_STAT_ADD(name, n) do {					\
+		unsigned __lcore_id = rte_lcore_id();			\
+		if (__lcore_id < RTE_MAX_LCORE)				\
+			priv_timer[__lcore_id].stats.name += (n);	\
 	} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -135,7 +137,7 @@ timer_set_config_state(struct rte_timer *tim,
 
 		/* timer is running on another core, exit */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    (unsigned)prev_status.owner != lcore_id)
+		    prev_status.owner != (uint16_t)lcore_id)
 			return -1;
 
 		/* timer is being configured on another core */
@@ -366,9 +368,15 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 
 	/* round robin for tim_lcore */
 	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-					       0, 1);
-		priv_timer[lcore_id].prev_lcore = tim_lcore;
+		if (lcore_id != LCORE_ID_ANY) {
+			tim_lcore = rte_get_next_lcore(
+				priv_timer[lcore_id].prev_lcore,
+				0, 1);
+			priv_timer[lcore_id].prev_lcore = tim_lcore;
+		} else
+			/* non-EAL threads do not run rte_timer_manage(),
+			 * so schedule the timer on the first enabled lcore. */
+			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
 	}
 
 	/* wait that the timer is in correct status before update,
@@ -378,7 +386,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 		return -1;
 
 	__TIMER_STAT_ADD(reset, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id != LCORE_ID_ANY) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -455,7 +464,8 @@ rte_timer_stop(struct rte_timer *tim)
 		return -1;
 
 	__TIMER_STAT_ADD(stop, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id != LCORE_ID_ANY) {
 		priv_timer[lcore_id].updated = 1;
 	}
 
@@ -499,6 +509,9 @@ void rte_timer_manage(void)
 	uint64_t cur_time;
 	int i, ret;
 
+	/* timer manager only runs on EAL thread */
+	assert(lcore_id != LCORE_ID_ANY);
+
 	__TIMER_STAT_ADD(manage, 1);
 	/* optimize for the case where per-cpu list is empty */
 	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..35b8719 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */
 
-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */
 
 /**
  * Timer type: Periodic or single (one-shot).
@@ -310,7 +310,7 @@ int rte_timer_pending(struct rte_timer *tim);
 /**
  * Manage the timer list and execute callback functions.
  *
- * This function must be called periodically from all cores
+ * This function must be called periodically from EAL lcores
  * main_loop(). It browses the list of pending timers and runs all
  * timers that are expired.
  *
-- 
1.8.1.4


* Re: [dpdk-dev] [PATCH v5 18/19] ring: add sched_yield to avoid spin forever
  2015-02-12  8:16           ` [dpdk-dev] [PATCH v5 18/19] ring: add sched_yield to avoid spin forever Cunming Liang
@ 2015-02-12 11:16             ` Olivier MATZ
  2015-02-12 13:05               ` Liang, Cunming
  0 siblings, 1 reply; 253+ messages in thread
From: Olivier MATZ @ 2015-02-12 11:16 UTC (permalink / raw)
  To: Cunming Liang, dev

Hi,

On 02/12/2015 09:16 AM, Cunming Liang wrote:
> Add a sched_yield() syscall if the thread spins for too long while waiting for another thread to finish its operations on the ring.
> That gives a pre-empted thread a chance to proceed and finish its ring enqueue/dequeue operation.
> The purpose is to reduce contention on the ring. In ring_perf_test, it shows no additional performance penalty.
>
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>   v5 changes:
>     add RTE_RING_PAUSE_REP to config file
>
>   v4 changes:
>     update and add more comments on sched_yield()
>
>   v3 changes:
>     new patch adding sched_yield() in rte_ring to avoid long spin
>
>   config/common_bsdapp       |  1 +
>   config/common_linuxapp     |  1 +
>   lib/librte_ring/rte_ring.h | 31 +++++++++++++++++++++++++++----
>   3 files changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index 57bacb8..52c5143 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
>   CONFIG_RTE_LIBRTE_RING=y
>   CONFIG_RTE_LIBRTE_RING_DEBUG=n
>   CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> +CONFIG_RTE_RING_PAUSE_REP=n

Maybe it's better to use CONFIG_RTE_RING_PAUSE_REP=0 instead?
If I understand correctly, it has to be set to an integer value to
enable it. Is that right?

Thanks,
Olivier
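
For readers following the question about the value type, here is a minimal
sketch of the spin-then-yield pattern under discussion. It is not the patch
code: the sketch_* names and SKETCH_PAUSE_REP are hypothetical, the repeat
count is passed as a parameter so it can be exercised directly, and the
rte_pause() call of the real loop is reduced to a comment. It assumes the
setting expands to an integer, with 0 meaning "never yield".

```c
#include <sched.h>
#include <stdint.h>

#ifndef SKETCH_PAUSE_REP
#define SKETCH_PAUSE_REP 0 /* hypothetical default: 0 disables yielding */
#endif

/* Returns 1 on every pause_rep-th call and resets the counter,
 * otherwise 0. With pause_rep == 0 the counter is never touched. */
static int sketch_should_yield(unsigned *rep, unsigned pause_rep)
{
	if (pause_rep && ++(*rep) == pause_rep) {
		*rep = 0;
		return 1;
	}
	return 0;
}

/* Spin until *tail reaches expected, yielding the CPU every pause_rep
 * iterations. Returns how many times sched_yield() was called.
 * The volatile read mirrors the style of the loop in the patch; it is
 * not a statement about the memory ordering a real ring needs. */
static unsigned sketch_wait_for_tail(const volatile uint32_t *tail,
				     uint32_t expected, unsigned pause_rep)
{
	unsigned rep = 0, yields = 0;

	while (*tail != expected) {
		/* the real loop issues rte_pause() here */
		if (sketch_should_yield(&rep, pause_rep)) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

A call such as sketch_wait_for_tail(&tail, head, SKETCH_PAUSE_REP) only
compiles and behaves sensibly when the macro is an integer expression, which
is why an integer default such as 0 reads more naturally than a yes/no flag.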