DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 0/8] Remove IPC threads
@ 2018-06-15 14:25 Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 1/8] eal/linux: use glibc malloc in alarm Anatoly Burakov
                   ` (17 more replies)
  0 siblings, 18 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, thomas, bruce.richardson

As previously discussed [1], IPC threads need to be removed and their
workload moved to interrupt thread.

FreeBSD did not have an interrupt thread, nor did it support alarm
API. This patchset adds support for both on FreeBSD. FreeBSD interrupt
thread is based on kevent, FreeBSD's native event multiplexing
mechanism similar to Linux's epoll.

The patchset makes FreeBSD's interrupts and alarm work just enough to
suffice for purposes of IPC, however there are really weird problems
observed. Specifically, FreeBSD's kevent timers are really slow to
trigger for some reason, sleeping on a 10ms timer as much as 200ms
before waking up. Interrupt handling on fd's is also a bit flaky.

It has also been observed that both problems go away if we do not
affinitize master lcore (by commenting relevant code out [2]). It is
not known why these problems are observed, nor it is clear what a
solution might entail.

For the purposes of making IPC work and having rudimentary support
for alarm and interrupt API's, this patchset works fine. However,
because of the above described issues, documentation will not be
updated to indicate support for interrupts on FreeBSD at this time.

[1] http://dpdk.org/dev/patchwork/patch/36579/
[2] http://dpdk.org/browse/dpdk/tree/lib/librte_eal/bsdapp/eal/eal.c#n729

Anatoly Burakov (4):
  ipc: remove IPC thread for async requests
  eal/bsdapp: add interrupt thread
  eal/bsdapp: add alarm support
  ipc: remove main IPC thread

Jianfeng Tan (4):
  eal/linux: use glibc malloc in alarm
  eal/linux: use glibc malloc in interrupt handling
  eal: bring forward init of interrupt handling
  eal: add IPC type for interrupt thread

 lib/librte_eal/bsdapp/eal/eal.c               |  10 +-
 lib/librte_eal/bsdapp/eal/eal_alarm.c         | 299 +++++++++++-
 lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 +
 lib/librte_eal/bsdapp/eal/eal_interrupts.c    | 460 +++++++++++++++++-
 lib/librte_eal/common/eal_common_proc.c       | 243 ++++-----
 .../common/include/rte_eal_interrupts.h       |   1 +
 lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
 lib/librte_eal/linuxapp/eal/eal_alarm.c       |   9 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c  |  19 +-
 test/test/test_interrupts.c                   |  29 +-
 10 files changed, 899 insertions(+), 200 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_alarm_private.h

-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 1/8] eal/linux: use glibc malloc in alarm
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 2/8] eal/linux: use glibc malloc in interrupt handling Anatoly Burakov
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: Jianfeng Tan, konstantin.ananyev, thomas, bruce.richardson

From: Jianfeng Tan <jianfeng.tan@intel.com>

IPC is used by memory subsystem, and when we merge IPC threads with
interrupt threads, asynchronous IPC requests will have to use alarm
API. However, currently, alarm API uses rte_malloc to allocate memory,
which will create a circular dependency between alarm API and memory
subsystem.

To avoid such chicken and egg dilemma, we change to use glibc malloc
in the alarm API.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_alarm.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_alarm.c b/lib/librte_eal/linuxapp/eal/eal_alarm.c
index c115e823a..391d2a65f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_alarm.c
+++ b/lib/librte_eal/linuxapp/eal/eal_alarm.c
@@ -19,7 +19,6 @@
 #include <rte_launch.h>
 #include <rte_lcore.h>
 #include <rte_errno.h>
-#include <rte_malloc.h>
 #include <rte_spinlock.h>
 #include <eal_private.h>
 
@@ -91,7 +90,7 @@ eal_alarm_callback(void *arg __rte_unused)
 		rte_spinlock_lock(&alarm_list_lk);
 
 		LIST_REMOVE(ap, next);
-		rte_free(ap);
+		free(ap);
 	}
 
 	if (!LIST_EMPTY(&alarm_list)) {
@@ -122,7 +121,7 @@ rte_eal_alarm_set(uint64_t us, rte_eal_alarm_callback cb_fn, void *cb_arg)
 	if (us < 1 || us > (UINT64_MAX - US_PER_S) || cb_fn == NULL)
 		return -EINVAL;
 
-	new_alarm = rte_zmalloc(NULL, sizeof(*new_alarm), 0);
+	new_alarm = calloc(1, sizeof(*new_alarm));
 	if (new_alarm == NULL)
 		return -ENOMEM;
 
@@ -196,7 +195,7 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
 
 			if (ap->executing == 0) {
 				LIST_REMOVE(ap, next);
-				rte_free(ap);
+				free(ap);
 				count++;
 			} else {
 				/* If calling from other context, mark that alarm is executing
@@ -220,7 +219,7 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
 
 				if (ap->executing == 0) {
 					LIST_REMOVE(ap, next);
-					rte_free(ap);
+					free(ap);
 					count++;
 					ap = ap_prev;
 				} else if (pthread_equal(ap->executing_id, pthread_self()) == 0)
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 2/8] eal/linux: use glibc malloc in interrupt handling
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 1/8] eal/linux: use glibc malloc in alarm Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 3/8] ipc: remove IPC thread for async requests Anatoly Burakov
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: Jianfeng Tan, konstantin.ananyev, thomas, bruce.richardson

From: Jianfeng Tan <jianfeng.tan@intel.com>

IPC is used by memory subsystem, and thus it should not use rte_malloc to
avoid circular dependency. Switch to using regular glibc malloc in
interrupts API.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 056d41c12..180c0378a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -30,7 +30,6 @@
 #include <rte_branch_prediction.h>
 #include <rte_debug.h>
 #include <rte_log.h>
-#include <rte_malloc.h>
 #include <rte_errno.h>
 #include <rte_spinlock.h>
 #include <rte_pause.h>
@@ -405,8 +404,7 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 	}
 
 	/* allocate a new interrupt callback entity */
-	callback = rte_zmalloc("interrupt callback list",
-				sizeof(*callback), 0);
+	callback = calloc(1, sizeof(*callback));
 	if (callback == NULL) {
 		RTE_LOG(ERR, EAL, "Can not allocate memory\n");
 		return -ENOMEM;
@@ -431,10 +429,10 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 
 	/* no existing callbacks for this - add new source */
 	if (src == NULL) {
-		if ((src = rte_zmalloc("interrupt source list",
-				sizeof(*src), 0)) == NULL) {
+		src = calloc(1, sizeof(*src));
+		if (src == NULL) {
 			RTE_LOG(ERR, EAL, "Can not allocate memory\n");
-			rte_free(callback);
+			free(callback);
 			ret = -ENOMEM;
 		} else {
 			src->intr_handle = *intr_handle;
@@ -501,7 +499,7 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
 			if (cb->cb_fn == cb_fn && (cb_arg == (void *)-1 ||
 					cb->cb_arg == cb_arg)) {
 				TAILQ_REMOVE(&src->callbacks, cb, next);
-				rte_free(cb);
+				free(cb);
 				ret++;
 			}
 		}
@@ -509,7 +507,7 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
 		/* all callbacks for that source are removed. */
 		if (TAILQ_EMPTY(&src->callbacks)) {
 			TAILQ_REMOVE(&intr_sources, src, next);
-			rte_free(src);
+			free(src);
 		}
 	}
 
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 3/8] ipc: remove IPC thread for async requests
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 1/8] eal/linux: use glibc malloc in alarm Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 2/8] eal/linux: use glibc malloc in interrupt handling Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 4/8] eal: bring forward init of interrupt handling Anatoly Burakov
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, thomas, bruce.richardson, Jianfeng Tan

Previously, we were using two IPC threads - one to handle messages
and synchronous requests, and another to handle asynchronous requests.
To handle replies for an async request, rte_mp_handle woke up the
rte_mp_handle_async thread to process through pthread_cond variable.

Change it to handle asynchronous messages within the main IPC thread.
To handle timeout events, for each async request which is sent,
we set an alarm for it. If its reply is received before timeout,
we will cancel the alarm when we handle the reply; otherwise,
alarm will invoke the async_reply_handle() as the alarm callback.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
---

Notes:
    RFC->RFCv2:
    - Rebased on latest code
    - Implemented comments to the original RFC

 lib/librte_eal/common/eal_common_proc.c | 201 +++++++++---------------
 1 file changed, 70 insertions(+), 131 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index 707d8ab30..6f3366403 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -20,6 +20,7 @@
 #include <sys/un.h>
 #include <unistd.h>
 
+#include <rte_alarm.h>
 #include <rte_common.h>
 #include <rte_cycles.h>
 #include <rte_eal.h>
@@ -94,11 +95,9 @@ TAILQ_HEAD(pending_request_list, pending_request);
 static struct {
 	struct pending_request_list requests;
 	pthread_mutex_t lock;
-	pthread_cond_t async_cond;
 } pending_requests = {
 	.requests = TAILQ_HEAD_INITIALIZER(pending_requests.requests),
 	.lock = PTHREAD_MUTEX_INITIALIZER,
-	.async_cond = PTHREAD_COND_INITIALIZER
 	/**< used in async requests only */
 };
 
@@ -106,6 +105,16 @@ static struct {
 static int
 mp_send(struct rte_mp_msg *msg, const char *peer, int type);
 
+/* for use with alarm callback */
+static void
+async_reply_handle(void *arg);
+
+/* for use with process_msg */
+static struct pending_request *
+async_reply_handle_thread_unsafe(void *arg);
+
+static void
+trigger_async_action(struct pending_request *req);
 
 static struct pending_request *
 find_pending_request(const char *dst, const char *act_name)
@@ -290,6 +299,8 @@ process_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 	RTE_LOG(DEBUG, EAL, "msg: %s\n", msg->name);
 
 	if (m->type == MP_REP || m->type == MP_IGN) {
+		struct pending_request *req = NULL;
+
 		pthread_mutex_lock(&pending_requests.lock);
 		pending_req = find_pending_request(s->sun_path, msg->name);
 		if (pending_req) {
@@ -301,11 +312,14 @@ process_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 			if (pending_req->type == REQUEST_TYPE_SYNC)
 				pthread_cond_signal(&pending_req->sync.cond);
 			else if (pending_req->type == REQUEST_TYPE_ASYNC)
-				pthread_cond_signal(
-					&pending_requests.async_cond);
+				req = async_reply_handle_thread_unsafe(
+						pending_req);
 		} else
 			RTE_LOG(ERR, EAL, "Drop mp reply: %s\n", msg->name);
 		pthread_mutex_unlock(&pending_requests.lock);
+
+		if (req != NULL)
+			trigger_async_action(req);
 		return;
 	}
 
@@ -365,7 +379,6 @@ timespec_cmp(const struct timespec *a, const struct timespec *b)
 }
 
 enum async_action {
-	ACTION_NONE, /**< don't do anything */
 	ACTION_FREE, /**< free the action entry, but don't trigger callback */
 	ACTION_TRIGGER /**< trigger callback, then free action entry */
 };
@@ -375,7 +388,7 @@ process_async_request(struct pending_request *sr, const struct timespec *now)
 {
 	struct async_request_param *param;
 	struct rte_mp_reply *reply;
-	bool timeout, received, last_msg;
+	bool timeout, last_msg;
 
 	param = sr->async.param;
 	reply = &param->user_reply;
@@ -383,13 +396,6 @@ process_async_request(struct pending_request *sr, const struct timespec *now)
 	/* did we timeout? */
 	timeout = timespec_cmp(&param->end, now) <= 0;
 
-	/* did we receive a response? */
-	received = sr->reply_received != 0;
-
-	/* if we didn't time out, and we didn't receive a response, ignore */
-	if (!timeout && !received)
-		return ACTION_NONE;
-
 	/* if we received a response, adjust relevant data and copy mesasge. */
 	if (sr->reply_received == 1 && sr->reply) {
 		struct rte_mp_msg *msg, *user_msgs, *tmp;
@@ -448,120 +454,60 @@ trigger_async_action(struct pending_request *sr)
 	free(sr->async.param->user_reply.msgs);
 	free(sr->async.param);
 	free(sr->request);
+	free(sr);
 }
 
 static struct pending_request *
-check_trigger(struct timespec *ts)
+async_reply_handle_thread_unsafe(void *arg)
 {
-	struct pending_request *next, *cur, *trigger = NULL;
-
-	TAILQ_FOREACH_SAFE(cur, &pending_requests.requests, next, next) {
-		enum async_action action;
-		if (cur->type != REQUEST_TYPE_ASYNC)
-			continue;
-
-		action = process_async_request(cur, ts);
-		if (action == ACTION_FREE) {
-			TAILQ_REMOVE(&pending_requests.requests, cur, next);
-			free(cur);
-		} else if (action == ACTION_TRIGGER) {
-			TAILQ_REMOVE(&pending_requests.requests, cur, next);
-			trigger = cur;
-			break;
-		}
-	}
-	return trigger;
-}
-
-static void
-wait_for_async_messages(void)
-{
-	struct pending_request *sr;
-	struct timespec timeout;
-	bool timedwait = false;
-	bool nowait = false;
-	int ret;
-
-	/* scan through the list and see if there are any timeouts that
-	 * are earlier than our current timeout.
-	 */
-	TAILQ_FOREACH(sr, &pending_requests.requests, next) {
-		if (sr->type != REQUEST_TYPE_ASYNC)
-			continue;
-		if (!timedwait || timespec_cmp(&sr->async.param->end,
-				&timeout) < 0) {
-			memcpy(&timeout, &sr->async.param->end,
-				sizeof(timeout));
-			timedwait = true;
-		}
-
-		/* sometimes, we don't even wait */
-		if (sr->reply_received) {
-			nowait = true;
-			break;
-		}
-	}
-
-	if (nowait)
-		return;
-
-	do {
-		ret = timedwait ?
-			pthread_cond_timedwait(
-				&pending_requests.async_cond,
-				&pending_requests.lock,
-				&timeout) :
-			pthread_cond_wait(
-				&pending_requests.async_cond,
-				&pending_requests.lock);
-	} while (ret != 0 && ret != ETIMEDOUT);
-
-	/* we've been woken up or timed out */
-}
-
-static void *
-async_reply_handle(void *arg __rte_unused)
-{
-	struct timeval now;
+	struct pending_request *req = (struct pending_request *)arg;
+	enum async_action action;
 	struct timespec ts_now;
-	while (1) {
-		struct pending_request *trigger = NULL;
+	struct timeval now;
 
-		pthread_mutex_lock(&pending_requests.lock);
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto no_trigger;
+	}
+	ts_now.tv_nsec = now.tv_usec * 1000;
+	ts_now.tv_sec = now.tv_sec;
 
-		/* we exit this function holding the lock */
-		wait_for_async_messages();
+	action = process_async_request(req, &ts_now);
 
-		if (gettimeofday(&now, NULL) < 0) {
-			pthread_mutex_unlock(&pending_requests.lock);
-			RTE_LOG(ERR, EAL, "Cannot get current time\n");
-			break;
+	TAILQ_REMOVE(&pending_requests.requests, req, next);
+
+	if (rte_eal_alarm_cancel(async_reply_handle, req) < 0) {
+		/* if we failed to cancel the alarm because it's already in
+		 * progress, don't proceed because otherwise we will end up
+		 * handling the same message twice.
+		 */
+		if (rte_errno == EINPROGRESS) {
+			RTE_LOG(DEBUG, EAL, "Request handling is already in progress\n");
+			goto no_trigger;
 		}
-		ts_now.tv_nsec = now.tv_usec * 1000;
-		ts_now.tv_sec = now.tv_sec;
-
-		do {
-			trigger = check_trigger(&ts_now);
-			/* unlock request list */
-			pthread_mutex_unlock(&pending_requests.lock);
-
-			if (trigger) {
-				trigger_async_action(trigger);
-				free(trigger);
-
-				/* we've triggered a callback, but there may be
-				 * more, so lock the list and check again.
-				 */
-				pthread_mutex_lock(&pending_requests.lock);
-			}
-		} while (trigger);
+		RTE_LOG(ERR, EAL, "Failed to cancel alarm\n");
 	}
 
-	RTE_LOG(ERR, EAL, "ERROR: asynchronous requests disabled\n");
-
+	if (action == ACTION_TRIGGER)
+		return req;
+no_trigger:
+	free(req);
 	return NULL;
 }
 
+static void
+async_reply_handle(void *arg)
+{
+	struct pending_request *req;
+
+	pthread_mutex_lock(&pending_requests.lock);
+	req = async_reply_handle_thread_unsafe(arg);
+	pthread_mutex_unlock(&pending_requests.lock);
+
+	if (req != NULL)
+		trigger_async_action(req);
+}
+
 static int
 open_socket_fd(void)
 {
@@ -624,7 +570,7 @@ rte_mp_channel_init(void)
 {
 	char path[PATH_MAX];
 	int dir_fd;
-	pthread_t mp_handle_tid, async_reply_handle_tid;
+	pthread_t mp_handle_tid;
 
 	/* create filter path */
 	create_socket_path("*", path, sizeof(path));
@@ -671,17 +617,6 @@ rte_mp_channel_init(void)
 		return -1;
 	}
 
-	if (rte_ctrl_thread_create(&async_reply_handle_tid,
-			"rte_mp_async", NULL,
-			async_reply_handle, NULL) < 0) {
-		RTE_LOG(ERR, EAL, "failed to create mp thead: %s\n",
-			strerror(errno));
-		close(mp_fd);
-		close(dir_fd);
-		mp_fd = -1;
-		return -1;
-	}
-
 	/* unlock the directory */
 	flock(dir_fd, LOCK_UN);
 	close(dir_fd);
@@ -853,7 +788,7 @@ rte_mp_sendmsg(struct rte_mp_msg *msg)
 
 static int
 mp_request_async(const char *dst, struct rte_mp_msg *req,
-		struct async_request_param *param)
+		struct async_request_param *param, const struct timespec *ts)
 {
 	struct rte_mp_msg *reply_msg;
 	struct pending_request *pending_req, *exist;
@@ -898,6 +833,13 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 
 	param->user_reply.nb_sent++;
 
+	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
+			      async_reply_handle, pending_req) < 0) {
+		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
+			dst, req->name);
+		rte_panic("Fix the above shit to properly free all memory\n");
+	}
+
 	return 0;
 fail:
 	free(pending_req);
@@ -1119,7 +1061,7 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 
 	/* for secondary process, send request to the primary process only */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
-		ret = mp_request_async(eal_mp_socket_path(), copy, param);
+		ret = mp_request_async(eal_mp_socket_path(), copy, param, ts);
 
 		/* if we didn't send anything, put dummy request on the queue */
 		if (ret == 0 && reply->nb_sent == 0) {
@@ -1162,7 +1104,7 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 		snprintf(path, sizeof(path), "%s/%s", mp_dir_path,
 			 ent->d_name);
 
-		if (mp_request_async(path, copy, param))
+		if (mp_request_async(path, copy, param, ts))
 			ret = -1;
 	}
 	/* if we didn't send anything, put dummy request on the queue */
@@ -1171,9 +1113,6 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 		dummy_used = true;
 	}
 
-	/* trigger async request thread wake up */
-	pthread_cond_signal(&pending_requests.async_cond);
-
 	/* finally, unlock the queue */
 	pthread_mutex_unlock(&pending_requests.lock);
 
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 4/8] eal: bring forward init of interrupt handling
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (2 preceding siblings ...)
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 3/8] ipc: remove IPC thread for async requests Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 5/8] eal: add IPC type for interrupt thread Anatoly Burakov
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: Jianfeng Tan, Bruce Richardson, konstantin.ananyev, thomas

From: Jianfeng Tan <jianfeng.tan@intel.com>

IPC will reply on interrupt handling, so we move forward the init
of interrupt handling.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 10 +++++-----
 lib/librte_eal/linuxapp/eal/eal.c | 10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index dc279542d..f70f7aecd 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -625,6 +625,11 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
+	if (rte_eal_intr_init() < 0) {
+		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
+		return -1;
+	}
+
 	/* Put mp channel init before bus scan so that we can init the vdev
 	 * bus through mp channel in the secondary process before the bus scan.
 	 */
@@ -713,11 +718,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_intr_init() < 0) {
-		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
-		return -1;
-	}
-
 	if (rte_eal_timer_init() < 0) {
 		rte_eal_init_alert("Cannot init HPET or TSC timers\n");
 		rte_errno = ENOTSUP;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..f8a0c06d7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -839,6 +839,11 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
+	if (rte_eal_intr_init() < 0) {
+		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
+		return -1;
+	}
+
 	/* Put mp channel init before bus scan so that we can init the vdev
 	 * bus through mp channel in the secondary process before the bus scan.
 	 */
@@ -968,11 +973,6 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0) {
-		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
-		return -1;
-	}
-
 	RTE_LCORE_FOREACH_SLAVE(i) {
 
 		/*
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 5/8] eal: add IPC type for interrupt thread
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (3 preceding siblings ...)
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 4/8] eal: bring forward init of interrupt handling Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 6/8] eal/bsdapp: add " Anatoly Burakov
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: Jianfeng Tan, konstantin.ananyev, thomas, bruce.richardson

From: Jianfeng Tan <jianfeng.tan@intel.com>

We are going to merge IPC into interrupt thread. This patch adds
IPC type for interrupt thread.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    RFC->RFCv2:
    - Fixed typo in test app

 .../common/include/rte_eal_interrupts.h       |  1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c  |  5 ++++
 test/test/test_interrupts.c                   | 29 ++++++++++++++++++-
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal_interrupts.h b/lib/librte_eal/common/include/rte_eal_interrupts.h
index 6eb493271..344db768d 100644
--- a/lib/librte_eal/common/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/common/include/rte_eal_interrupts.h
@@ -35,6 +35,7 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_EXT,          /**< external handler */
 	RTE_INTR_HANDLE_VDEV,         /**< virtual device */
 	RTE_INTR_HANDLE_DEV_EVENT,    /**< device event handle */
+	RTE_INTR_HANDLE_IPC,          /**< IPC event handle */
 	RTE_INTR_HANDLE_MAX           /**< count of elements */
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 180c0378a..390672739 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -560,6 +560,8 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_DEV_EVENT:
 		return -1;
+	case RTE_INTR_HANDLE_IPC:
+		return -1;
 	/* unknown handle type */
 	default:
 		RTE_LOG(ERR, EAL,
@@ -610,6 +612,8 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
 	/* not used at this moment */
 	case RTE_INTR_HANDLE_DEV_EVENT:
 		return -1;
+	case RTE_INTR_HANDLE_IPC:
+		return -1;
 	/* unknown handle type */
 	default:
 		RTE_LOG(ERR, EAL,
@@ -679,6 +683,7 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
 			call = true;
 			break;
 		case RTE_INTR_HANDLE_DEV_EVENT:
+		case RTE_INTR_HANDLE_IPC:
 			bytes_read = 0;
 			call = true;
 			break;
diff --git a/test/test/test_interrupts.c b/test/test/test_interrupts.c
index dc19175d3..fa18ddf75 100644
--- a/test/test/test_interrupts.c
+++ b/test/test/test_interrupts.c
@@ -21,6 +21,7 @@ enum test_interrupt_handle_type {
 	TEST_INTERRUPT_HANDLE_VALID_UIO,
 	TEST_INTERRUPT_HANDLE_VALID_ALARM,
 	TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT,
+	TEST_INTERRUPT_HANDLE_VALID_IPC,
 	TEST_INTERRUPT_HANDLE_CASE1,
 	TEST_INTERRUPT_HANDLE_MAX
 };
@@ -85,6 +86,10 @@ test_interrupt_init(void)
 	intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].type =
 					RTE_INTR_HANDLE_DEV_EVENT;
 
+	intr_handles[TEST_INTERRUPT_HANDLE_VALID_IPC].fd = pfds.readfd;
+	intr_handles[TEST_INTERRUPT_HANDLE_VALID_IPC].type =
+					RTE_INTR_HANDLE_IPC;
+
 	intr_handles[TEST_INTERRUPT_HANDLE_CASE1].fd = pfds.writefd;
 	intr_handles[TEST_INTERRUPT_HANDLE_CASE1].type = RTE_INTR_HANDLE_UIO;
 
@@ -263,6 +268,14 @@ test_interrupt_enable(void)
 		return -1;
 	}
 
+	/* check with specific valid intr_handle */
+	test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_IPC];
+	if (rte_intr_enable(&test_intr_handle) == 0) {
+		printf("unexpectedly enable a specific intr_handle "
+			"successfully\n");
+		return -1;
+	}
+
 	/* check with valid handler and its type */
 	test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
 	if (rte_intr_enable(&test_intr_handle) < 0) {
@@ -327,6 +340,14 @@ test_interrupt_disable(void)
 		return -1;
 	}
 
+	/* check with specific valid intr_handle */
+	test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_IPC];
+	if (rte_intr_disable(&test_intr_handle) == 0) {
+		printf("unexpectedly disable a specific intr_handle "
+			"successfully\n");
+		return -1;
+	}
+
 	/* check with valid handler and its type */
 	test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
 	if (rte_intr_disable(&test_intr_handle) < 0) {
@@ -424,7 +445,7 @@ test_interrupt(void)
 
 	printf("Check valid alarm interrupt full path\n");
 	if (test_interrupt_full_path_check(
-		TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT) < 0) {
+		TEST_INTERRUPT_HANDLE_VALID_IPC) < 0) {
 		printf("failure occurred during checking valid alarm "
 						"interrupt full path\n");
 		goto out;
@@ -548,6 +569,12 @@ test_interrupt(void)
 	rte_intr_callback_unregister(&test_intr_handle,
 			test_interrupt_callback_1, (void *)-1);
 
+	test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_IPC];
+	rte_intr_callback_unregister(&test_intr_handle,
+			test_interrupt_callback, (void *)-1);
+	rte_intr_callback_unregister(&test_intr_handle,
+			test_interrupt_callback_1, (void *)-1);
+
 	rte_delay_ms(2 * TEST_INTERRUPT_CHECK_INTERVAL);
 	/* deinit */
 	test_interrupt_deinit();
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 6/8] eal/bsdapp: add interrupt thread
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (4 preceding siblings ...)
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 5/8] eal: add IPC type for interrupt thread Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 7/8] eal/bsdapp: add alarm support Anatoly Burakov
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, konstantin.ananyev, thomas

Add interrupt thread to FreeBSD. It is largely a copy-paste from
Linuxapp interrupt thread, except for a few key differences:

* Use kevent instead of epoll
* Do not recreate the event queue on adding/removing interrupt
  sources, add/remove them to/from the queue on the fly instead
* No support for UIO/VFIO handles

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_interrupts.c | 441 ++++++++++++++++++++-
 1 file changed, 423 insertions(+), 18 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 290d53ab9..0fe068894 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -1,51 +1,456 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2018 Intel Corporation
  */
 
+#include <string.h>
+#include <sys/types.h>
+#include <sys/event.h>
+#include <sys/queue.h>
+#include <unistd.h>
+
+#include <rte_errno.h>
+#include <rte_interrupts.h>
+#include <rte_lcore.h>
+#include <rte_spinlock.h>
 #include <rte_common.h>
 #include <rte_interrupts.h>
+
 #include "eal_private.h"
 
+#define MAX_INTR_EVENTS 16
+
+/**
+ * union buffer for reading on different devices
+ */
+union rte_intr_read_buffer {
+	char charbuf[16];                /* for others */
+};
+
+TAILQ_HEAD(rte_intr_cb_list, rte_intr_callback);
+TAILQ_HEAD(rte_intr_source_list, rte_intr_source);
+
+struct rte_intr_callback {
+	TAILQ_ENTRY(rte_intr_callback) next;
+	rte_intr_callback_fn cb_fn;  /**< callback address */
+	void *cb_arg;                /**< parameter for callback */
+};
+
+struct rte_intr_source {
+	TAILQ_ENTRY(rte_intr_source) next;
+	struct rte_intr_handle intr_handle; /**< interrupt handle */
+	struct rte_intr_cb_list callbacks;  /**< user callbacks */
+	uint32_t active;
+};
+
+/* global spinlock for interrupt data operation */
+static rte_spinlock_t intr_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* interrupt sources list */
+static struct rte_intr_source_list intr_sources;
+
+/* interrupt handling thread */
+static pthread_t intr_thread;
+
+static volatile int kq = -1;
+
+static int
+intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
+{
+	ke->filter = EVFILT_READ;
+	ke->ident = ih->fd;
+
+	return 0;
+}
+
 int
 rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
-			rte_intr_callback_fn cb,
-			void *cb_arg)
+		rte_intr_callback_fn cb, void *cb_arg)
 {
-	RTE_SET_USED(intr_handle);
-	RTE_SET_USED(cb);
-	RTE_SET_USED(cb_arg);
+	struct rte_intr_callback *callback = NULL;
+	struct rte_intr_source *src = NULL;
+	int ret, add_event;
 
-	return -ENOTSUP;
+	/* first do parameter checking */
+	if (intr_handle == NULL || intr_handle->fd < 0 || cb == NULL) {
+		RTE_LOG(ERR, EAL,
+			"Registering with invalid input parameter\n");
+		return -EINVAL;
+	}
+	if (kq < 0) {
+		RTE_LOG(ERR, EAL, "Kqueue is not active: %d\n", kq);
+		return -ENODEV;
+	}
+
+	/* allocate a new interrupt callback entity */
+	callback = calloc(1, sizeof(*callback));
+	if (callback == NULL) {
+		RTE_LOG(ERR, EAL, "Can not allocate memory\n");
+		return -ENOMEM;
+	}
+	callback->cb_fn = cb;
+	callback->cb_arg = cb_arg;
+
+	rte_spinlock_lock(&intr_lock);
+
+	/* check if there is at least one callback registered for the fd */
+	TAILQ_FOREACH(src, &intr_sources, next) {
+		if (src->intr_handle.fd == intr_handle->fd) {
+			/* we had no interrupts for this */
+			if (TAILQ_EMPTY(&src->callbacks))
+				add_event = 1;
+
+			TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
+			ret = 0;
+			break;
+		}
+	}
+
+	/* no existing callbacks for this - add new source */
+	if (src == NULL) {
+		src = calloc(1, sizeof(*src));
+		if (src == NULL) {
+			RTE_LOG(ERR, EAL, "Can not allocate memory\n");
+			ret = -ENOMEM;
+			goto fail;
+		} else {
+			src->intr_handle = *intr_handle;
+			TAILQ_INIT(&src->callbacks);
+			TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
+			TAILQ_INSERT_TAIL(&intr_sources, src, next);
+			add_event = 1;
+			ret = 0;
+		}
+	}
+
+	/* add events to the queue */
+	if (add_event) {
+		struct kevent ke;
+
+		memset(&ke, 0, sizeof(ke));
+		ke.flags = EV_ADD; /* mark for addition to the queue */
+
+		if (intr_source_to_kevent(intr_handle, &ke) < 0) {
+			RTE_LOG(ERR, EAL, "Cannot convert interrupt handle to kevent\n");
+			ret = -ENODEV;
+			goto fail;
+		}
+
+		/**
+		 * add the intr file descriptor into wait list.
+		 */
+		if (kevent(kq, &ke, 1, NULL, 0, NULL) < 0) {
+			RTE_LOG(ERR, EAL, "Error adding fd %d kevent, %s\n",
+				src->intr_handle.fd, strerror(errno));
+			ret = -errno;
+			goto fail;
+		}
+	}
+	rte_spinlock_unlock(&intr_lock);
+
+	return ret;
+fail:
+	/* clean up */
+	if (src != NULL) {
+		TAILQ_REMOVE(&(src->callbacks), callback, next);
+		if (TAILQ_EMPTY(&(src->callbacks))) {
+			TAILQ_REMOVE(&intr_sources, src, next);
+			free(src);
+		}
+	}
+	free(callback);
+	rte_spinlock_unlock(&intr_lock);
+	return ret;
 }
 
 int
 rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
-			rte_intr_callback_fn cb,
-			void *cb_arg)
+		rte_intr_callback_fn cb_fn, void *cb_arg)
 {
-	RTE_SET_USED(intr_handle);
-	RTE_SET_USED(cb);
-	RTE_SET_USED(cb_arg);
+	int ret;
+	struct rte_intr_source *src;
+	struct rte_intr_callback *cb, *next;
 
-	return -ENOTSUP;
+	/* do parameter checking first */
+	if (intr_handle == NULL || intr_handle->fd < 0) {
+		RTE_LOG(ERR, EAL,
+		"Unregistering with invalid input parameter\n");
+		return -EINVAL;
+	}
+	if (kq < 0) {
+		RTE_LOG(ERR, EAL, "Kqueue is not active\n");
+		return -ENODEV;
+	}
+
+	rte_spinlock_lock(&intr_lock);
+
+	/* check if the insterrupt source for the fd is existent */
+	TAILQ_FOREACH(src, &intr_sources, next)
+		if (src->intr_handle.fd == intr_handle->fd)
+			break;
+
+	/* No interrupt source registered for the fd */
+	if (src == NULL) {
+		ret = -ENOENT;
+
+	/* interrupt source has some active callbacks right now. */
+	} else if (src->active != 0) {
+		ret = -EAGAIN;
+
+	/* ok to remove. */
+	} else {
+		struct kevent ke;
+
+		ret = 0;
+
+		/* remove it from the kqueue */
+		memset(&ke, 0, sizeof(ke));
+		ke.flags = EV_DELETE; /* mark for deletion from the queue */
+
+		if (intr_source_to_kevent(intr_handle, &ke) < 0) {
+			RTE_LOG(ERR, EAL, "Cannot convert to kevent\n");
+			ret = -ENODEV;
+			goto out;
+		}
+
+		/**
+		 * remove intr file descriptor from wait list.
+		 */
+		if (kevent(kq, &ke, 1, NULL, 0, NULL) < 0) {
+			RTE_LOG(ERR, EAL, "Error removing fd %d kevent, %s\n",
+				src->intr_handle.fd, strerror(errno));
+			ret = -errno;
+			goto out;
+		}
+
+		/*walk through the callbacks and remove all that match. */
+		for (cb = TAILQ_FIRST(&src->callbacks); cb != NULL; cb = next) {
+			next = TAILQ_NEXT(cb, next);
+			if (cb->cb_fn == cb_fn && (cb_arg == (void *)-1 ||
+					cb->cb_arg == cb_arg)) {
+				TAILQ_REMOVE(&src->callbacks, cb, next);
+				free(cb);
+				ret++;
+			}
+		}
+
+		/* all callbacks for that source are removed. */
+		if (TAILQ_EMPTY(&src->callbacks)) {
+			TAILQ_REMOVE(&intr_sources, src, next);
+			free(src);
+		}
+	}
+out:
+	rte_spinlock_unlock(&intr_lock);
+
+	return ret;
 }
 
 int
-rte_intr_enable(const struct rte_intr_handle *intr_handle __rte_unused)
+rte_intr_enable(const struct rte_intr_handle *intr_handle)
 {
-	return -ENOTSUP;
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	case RTE_INTR_HANDLE_IPC:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
 }
 
 int
-rte_intr_disable(const struct rte_intr_handle *intr_handle __rte_unused)
+rte_intr_disable(const struct rte_intr_handle *intr_handle)
 {
-	return -ENOTSUP;
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	case RTE_INTR_HANDLE_IPC:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+static void
+eal_intr_process_interrupts(struct kevent *events, int nfds)
+{
+	struct rte_intr_callback active_cb;
+	union rte_intr_read_buffer buf;
+	struct rte_intr_callback *cb;
+	struct rte_intr_source *src;
+	bool call = false;
+	int n, bytes_read;
+
+	for (n = 0; n < nfds; n++) {
+		int event_fd = events[n].ident;
+
+		rte_spinlock_lock(&intr_lock);
+		TAILQ_FOREACH(src, &intr_sources, next)
+			if (src->intr_handle.fd == event_fd)
+				break;
+		if (src == NULL) {
+			rte_spinlock_unlock(&intr_lock);
+			continue;
+		}
+
+		/* mark this interrupt source as active and release the lock. */
+		src->active = 1;
+		rte_spinlock_unlock(&intr_lock);
+
+		/* set the length to be read dor different handle type */
+		switch (src->intr_handle.type) {
+		case RTE_INTR_HANDLE_ALARM:
+			bytes_read = 0;
+			call = true;
+			break;
+		case RTE_INTR_HANDLE_VDEV:
+		case RTE_INTR_HANDLE_EXT:
+			bytes_read = 0;
+			call = true;
+			break;
+		case RTE_INTR_HANDLE_DEV_EVENT:
+		case RTE_INTR_HANDLE_IPC:
+			bytes_read = 0;
+			call = true;
+			break;
+		default:
+			bytes_read = 1;
+			break;
+		}
+
+		if (bytes_read > 0) {
+			/**
+			 * read out to clear the ready-to-be-read flag
+			 * for epoll_wait.
+			 */
+			bytes_read = read(event_fd, &buf, bytes_read);
+			if (bytes_read < 0) {
+				if (errno == EINTR || errno == EWOULDBLOCK)
+					continue;
+
+				RTE_LOG(ERR, EAL, "Error reading from file "
+					"descriptor %d: %s\n",
+					event_fd,
+					strerror(errno));
+			} else if (bytes_read == 0)
+				RTE_LOG(ERR, EAL, "Read nothing from file "
+					"descriptor %d\n", event_fd);
+			else
+				call = true;
+		}
+
+		/* grab a lock, again to call callbacks and update status. */
+		rte_spinlock_lock(&intr_lock);
+
+		if (call) {
+			/* Finally, call all callbacks. */
+			TAILQ_FOREACH(cb, &src->callbacks, next) {
+
+				/* make a copy and unlock. */
+				active_cb = *cb;
+				rte_spinlock_unlock(&intr_lock);
+
+				/* call the actual callback */
+				active_cb.cb_fn(active_cb.cb_arg);
+
+				/*get the lock back. */
+				rte_spinlock_lock(&intr_lock);
+			}
+		}
+
+		/* we done with that interrupt source, release it. */
+		src->active = 0;
+		rte_spinlock_unlock(&intr_lock);
+	}
+}
+
+static void *
+eal_intr_thread_main(void *arg __rte_unused)
+{
+	struct kevent events[MAX_INTR_EVENTS];
+	int nfds;
+
+	/* host thread, never break out */
+	for (;;) {
+		/* do not change anything, just wait */
+		nfds = kevent(kq, NULL, 0, events, MAX_INTR_EVENTS, NULL);
+
+		/* kevent fail */
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			RTE_LOG(ERR, EAL,
+				"kevent returns with fail\n");
+			break;
+		}
+		/* kevent timeout, will never happen here */
+		else if (nfds == 0)
+			continue;
+
+		/* kevent has at least one fd ready to read */
+		eal_intr_process_interrupts(events, nfds);
+	}
+	close(kq);
+	kq = -1;
+	return NULL;
 }
 
 int
 rte_eal_intr_init(void)
 {
-	return 0;
+	int ret = 0;
+
+	/* init the global interrupt source head */
+	TAILQ_INIT(&intr_sources);
+
+	kq = kqueue();
+	if (kq < 0) {
+		RTE_LOG(ERR, EAL, "Cannot create kqueue instance\n");
+		return -1;
+	}
+
+	/* create the host thread to wait/handle the interrupt */
+	ret = rte_ctrl_thread_create(&intr_thread, "eal-intr-thread", NULL,
+			eal_intr_thread_main, NULL);
+	if (ret != 0) {
+		rte_errno = -ret;
+		RTE_LOG(ERR, EAL,
+			"Failed to create thread for interrupt handling\n");
+	}
+
+	return ret;
 }
 
 int
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 7/8] eal/bsdapp: add alarm support
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (5 preceding siblings ...)
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 6/8] eal/bsdapp: add " Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 8/8] ipc: remove main IPC thread Anatoly Burakov
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, konstantin.ananyev, thomas

Implement EAL alarm API support for FreeBSD. The implementation
is largely identical to that of Linux version, with one key
difference.

The alarm API is a little Linux-centric in that it is expecting
the alarm API to manage alarm timeouts without involvement of the
interrupt thread. This works on Linux because in Linux, there's
timerfd API which allows waiting for timer events on an fd.

On FreeBSD, however, there are no timerfd's, and timer events are
set up directly in kevent. There is no way to pass information from
the alarm API to the interrupt thread, so we also add a little
back-channel magic to get soonest alarm timeout from the alarm API.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_alarm.c         | 299 +++++++++++++++++-
 lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 ++
 lib/librte_eal/bsdapp/eal/eal_interrupts.c    |  29 +-
 3 files changed, 334 insertions(+), 13 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_alarm_private.h

diff --git a/lib/librte_eal/bsdapp/eal/eal_alarm.c b/lib/librte_eal/bsdapp/eal/eal_alarm.c
index eb3913c97..55763e520 100644
--- a/lib/librte_eal/bsdapp/eal/eal_alarm.c
+++ b/lib/librte_eal/bsdapp/eal/eal_alarm.c
@@ -1,31 +1,314 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2018 Intel Corporation
  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdio.h>
 #include <stdlib.h>
+#include <string.h>
+#include <time.h>
 #include <errno.h>
 
 #include <rte_alarm.h>
+#include <rte_cycles.h>
 #include <rte_common.h>
+#include <rte_errno.h>
+#include <rte_interrupts.h>
+#include <rte_spinlock.h>
+
 #include "eal_private.h"
+#include "eal_alarm_private.h"
+
+#define NS_PER_US 1000
+
+#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
+#else
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC
+#endif
+
+struct alarm_entry {
+	LIST_ENTRY(alarm_entry) next;
+	struct rte_intr_handle handle;
+	struct timespec time;
+	rte_eal_alarm_callback cb_fn;
+	void *cb_arg;
+	volatile uint8_t executing;
+	volatile pthread_t executing_id;
+};
+
+static LIST_HEAD(alarm_list, alarm_entry) alarm_list = LIST_HEAD_INITIALIZER();
+static rte_spinlock_t alarm_list_lk = RTE_SPINLOCK_INITIALIZER;
+
+static struct rte_intr_handle intr_handle = {.fd = -1 };
+static void eal_alarm_callback(void *arg);
 
 int
 rte_eal_alarm_init(void)
 {
+	intr_handle.type = RTE_INTR_HANDLE_ALARM;
+
+	/* on FreeBSD, timers don't use fd's, and their identifiers are stored
+	 * in separate namespace from fd's, so using any value is OK. however,
+	 * EAL interrupts handler expects fd's to be unique, so use an actual fd
+	 * to guarantee unique timer identifier.
+	 */
+	intr_handle.fd = open("/dev/zero", O_RDONLY);
+
+	return 0;
+}
+
+static inline int
+timespec_cmp(const struct timespec *now, const struct timespec *at)
+{
+	if (now->tv_sec < at->tv_sec)
+		return -1;
+	if (now->tv_sec > at->tv_sec)
+		return 1;
+	if (now->tv_nsec < at->tv_nsec)
+		return -1;
+	if (now->tv_nsec > at->tv_nsec)
+		return 1;
+	return 0;
+}
+
+static inline uint64_t
+diff_ns(struct timespec *now, struct timespec *at)
+{
+	uint64_t now_ns, at_ns;
+
+	if (timespec_cmp(now, at) >= 0)
+		return 0;
+
+	now_ns = now->tv_sec * NS_PER_S + now->tv_nsec;
+	at_ns = at->tv_sec * NS_PER_S + at->tv_nsec;
+
+	return at_ns - now_ns;
+}
+
+int
+eal_alarm_get_timeout_ns(uint64_t *val)
+{
+	struct alarm_entry *ap;
+	struct timespec now;
+
+	if (clock_gettime(CLOCK_TYPE_ID, &now) < 0)
+		return -1;
+
+	if (LIST_EMPTY(&alarm_list))
+		return -1;
+
+	ap = LIST_FIRST(&alarm_list);
+
+	*val = diff_ns(&now, &ap->time);
+
 	return 0;
 }
 
+static int
+unregister_current_callback(void)
+{
+	struct alarm_entry *ap;
+	int ret = 0;
+
+	if (!LIST_EMPTY(&alarm_list)) {
+		ap = LIST_FIRST(&alarm_list);
+
+		do {
+			ret = rte_intr_callback_unregister(&intr_handle,
+				eal_alarm_callback, &ap->time);
+		} while (ret == -EAGAIN);
+	}
+
+	return ret;
+}
+
+static int
+register_first_callback(void)
+{
+	struct alarm_entry *ap;
+	int ret = 0;
+
+	if (!LIST_EMPTY(&alarm_list)) {
+		ap = LIST_FIRST(&alarm_list);
+
+		/* register a new callback */
+		ret = rte_intr_callback_register(&intr_handle,
+				eal_alarm_callback, &ap->time);
+	}
+	return ret;
+}
+
+static void
+eal_alarm_callback(void *arg __rte_unused)
+{
+	struct timespec now;
+	struct alarm_entry *ap;
+
+	rte_spinlock_lock(&alarm_list_lk);
+	ap = LIST_FIRST(&alarm_list);
+
+	if (clock_gettime(CLOCK_TYPE_ID, &now) < 0)
+		return;
+
+	while (ap != NULL && timespec_cmp(&now, &ap->time) >= 0) {
+		ap->executing = 1;
+		ap->executing_id = pthread_self();
+		rte_spinlock_unlock(&alarm_list_lk);
+
+		ap->cb_fn(ap->cb_arg);
+
+		rte_spinlock_lock(&alarm_list_lk);
+
+		LIST_REMOVE(ap, next);
+		free(ap);
+
+		ap = LIST_FIRST(&alarm_list);
+	}
+
+	/* timer has been deleted from the kqueue, so recreate it if needed */
+	register_first_callback();
+
+	rte_spinlock_unlock(&alarm_list_lk);
+}
+
 
 int
-rte_eal_alarm_set(uint64_t us __rte_unused,
-		rte_eal_alarm_callback cb_fn __rte_unused,
-		void *cb_arg __rte_unused)
+rte_eal_alarm_set(uint64_t us, rte_eal_alarm_callback cb_fn, void *cb_arg)
 {
-	return -ENOTSUP;
+	struct alarm_entry *ap, *new_alarm;
+	struct timespec now;
+	uint64_t ns;
+	int ret = 0;
+
+	/* check parameters, also ensure us won't cause a uint64_t overflow */
+	if (us < 1 || us > (UINT64_MAX - US_PER_S) || cb_fn == NULL)
+		return -EINVAL;
+
+	new_alarm = calloc(1, sizeof(*new_alarm));
+	if (new_alarm == NULL)
+		return -ENOMEM;
+
+	/* use current time to calculate absolute time of alarm */
+	clock_gettime(CLOCK_TYPE_ID, &now);
+
+	ns = us * NS_PER_US;
+
+	new_alarm->cb_fn = cb_fn;
+	new_alarm->cb_arg = cb_arg;
+	new_alarm->time.tv_nsec = (now.tv_nsec + ns) % NS_PER_S;
+	new_alarm->time.tv_sec = now.tv_sec + ((now.tv_nsec + ns) / NS_PER_S);
+
+	rte_spinlock_lock(&alarm_list_lk);
+
+	if (LIST_EMPTY(&alarm_list))
+		LIST_INSERT_HEAD(&alarm_list, new_alarm, next);
+	else {
+		LIST_FOREACH(ap, &alarm_list, next) {
+			if (timespec_cmp(&new_alarm->time, &ap->time) < 0) {
+				LIST_INSERT_BEFORE(ap, new_alarm, next);
+				break;
+			}
+			if (LIST_NEXT(ap, next) == NULL) {
+				LIST_INSERT_AFTER(ap, new_alarm, next);
+				break;
+			}
+		}
+	}
+
+	/* re-register first callback just in case */
+	register_first_callback();
+
+	rte_spinlock_unlock(&alarm_list_lk);
+
+	return ret;
 }
 
 int
-rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn __rte_unused,
-		void *cb_arg __rte_unused)
+rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
 {
-	return -ENOTSUP;
+	struct alarm_entry *ap, *ap_prev;
+	int count = 0;
+	int err = 0;
+	int executing;
+
+	if (!cb_fn) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	do {
+		executing = 0;
+		rte_spinlock_lock(&alarm_list_lk);
+		/* remove any matches at the start of the list */
+		while (1) {
+			ap = LIST_FIRST(&alarm_list);
+			if (ap == NULL)
+				break;
+			if (cb_fn != ap->cb_fn)
+				break;
+			if (cb_arg != ap->cb_arg && cb_arg != (void *) -1)
+				break;
+			if (ap->executing == 0) {
+				LIST_REMOVE(ap, next);
+				free(ap);
+				count++;
+			} else {
+				/* If calling from other context, mark that
+				 * alarm is executing so loop can spin till it
+				 * finish. Otherwise we are trying to cancel
+				 * ourselves - mark it by EINPROGRESS.
+				 */
+				if (pthread_equal(ap->executing_id,
+						pthread_self()) == 0)
+					executing++;
+				else
+					err = EINPROGRESS;
+
+				break;
+			}
+		}
+		ap_prev = ap;
+
+		/* now go through list, removing entries not at start */
+		LIST_FOREACH(ap, &alarm_list, next) {
+			/* this won't be true first time through */
+			if (cb_fn == ap->cb_fn &&
+					(cb_arg == (void *)-1 ||
+					 cb_arg == ap->cb_arg)) {
+				if (ap->executing == 0) {
+					LIST_REMOVE(ap, next);
+					free(ap);
+					count++;
+					ap = ap_prev;
+				} else if (pthread_equal(ap->executing_id,
+							 pthread_self()) == 0) {
+					executing++;
+				} else {
+					err = EINPROGRESS;
+				}
+			}
+			ap_prev = ap;
+		}
+		rte_spinlock_unlock(&alarm_list_lk);
+	} while (executing != 0);
+
+	if (count == 0 && err == 0)
+		rte_errno = ENOENT;
+	else if (err)
+		rte_errno = err;
+
+	rte_spinlock_lock(&alarm_list_lk);
+
+	/* unregister if no alarms left, otherwise re-register first */
+	if (LIST_EMPTY(&alarm_list))
+		unregister_current_callback();
+	else
+		register_first_callback();
+
+	rte_spinlock_unlock(&alarm_list_lk);
+
+	return count;
 }
diff --git a/lib/librte_eal/bsdapp/eal/eal_alarm_private.h b/lib/librte_eal/bsdapp/eal/eal_alarm_private.h
new file mode 100644
index 000000000..65c711518
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_alarm_private.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef EAL_ALARM_PRIVATE_H
+#define EAL_ALARM_PRIVATE_H
+
+#include <inttypes.h>
+
+/*
+ * FreeBSD needs a back-channel communication mechanism between interrupt and
+ * alarm thread, because on FreeBSD, timer period is set up inside the interrupt
+ * API and not inside alarm API like on Linux.
+ */
+
+int
+eal_alarm_get_timeout_ns(uint64_t *val);
+
+#endif // EAL_ALARM_PRIVATE_H
diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 0fe068894..70ef844ff 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -16,6 +16,7 @@
 #include <rte_interrupts.h>
 
 #include "eal_private.h"
+#include "eal_alarm_private.h"
 
 #define MAX_INTR_EVENTS 16
 
@@ -56,7 +57,22 @@ static volatile int kq = -1;
 static int
 intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
 {
-	ke->filter = EVFILT_READ;
+	/* alarm callbacks are special case */
+	if (ih->type == RTE_INTR_HANDLE_ALARM) {
+		uint64_t timeout_ns;
+
+		/* get soonest alarm timeout */
+		if (eal_alarm_get_timeout_ns(&timeout_ns) < 0)
+			return -1;
+
+		ke->filter = EVFILT_TIMER;
+		/* timers are one shot */
+		ke->flags |= EV_ONESHOT;
+		ke->fflags = NOTE_NSECONDS;
+		ke->data = timeout_ns;
+	} else {
+		ke->filter = EVFILT_READ;
+	}
 	ke->ident = ih->fd;
 
 	return 0;
@@ -122,8 +138,10 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 		}
 	}
 
-	/* add events to the queue */
-	if (add_event) {
+	/* add events to the queue. timer events are special as we need to
+	 * re-set the timer.
+	 */
+	if (add_event || src->intr_handle.type == RTE_INTR_HANDLE_ALARM) {
 		struct kevent ke;
 
 		memset(&ke, 0, sizeof(ke));
@@ -218,8 +236,9 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
 		if (kevent(kq, &ke, 1, NULL, 0, NULL) < 0) {
 			RTE_LOG(ERR, EAL, "Error removing fd %d kevent, %s\n",
 				src->intr_handle.fd, strerror(errno));
-			ret = -errno;
-			goto out;
+			/* removing non-existent even is an expected condition
+			 * in some circumstances (e.g. oneshot events).
+			 */
 		}
 
 		/*walk through the callbacks and remove all that match. */
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH 8/8] ipc: remove main IPC thread
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (6 preceding siblings ...)
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 7/8] eal/bsdapp: add alarm support Anatoly Burakov
@ 2018-06-15 14:25 ` Anatoly Burakov
  2018-06-26  1:19 ` [dpdk-dev] [PATCH 0/8] Remove IPC threads Zhang, Qi Z
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-15 14:25 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, thomas, bruce.richardson, Jianfeng Tan

Previously, to handle requests from peer(s), or replies for a
request (sync or async) by itself, a dedicated IPC thread was set
up.

Now that every other piece of the puzzle is in place, we can get rid
of the IPC thread, and move waiting for IPC messages entirely into the
interrupt thread.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
---

Notes:
    RFC->RFCv2:
    - Fixed resource leaks
    - Improved readability

 lib/librte_eal/common/eal_common_proc.c | 46 ++++++++++++++-----------
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index 6f3366403..162d67ca5 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -25,6 +25,7 @@
 #include <rte_cycles.h>
 #include <rte_eal.h>
 #include <rte_errno.h>
+#include <rte_interrupts.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
 #include <rte_tailq.h>
@@ -101,6 +102,8 @@ static struct {
 	/**< used in async requests only */
 };
 
+static struct rte_intr_handle ipc_intr_handle;
+
 /* forward declarations */
 static int
 mp_send(struct rte_mp_msg *msg, const char *peer, int type);
@@ -350,18 +353,17 @@ process_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 	}
 }
 
-static void *
+static void
 mp_handle(void *arg __rte_unused)
 {
 	struct mp_msg_internal msg;
 	struct sockaddr_un sa;
 
 	while (1) {
-		if (read_msg(&msg, &sa) == 0)
-			process_msg(&msg, &sa);
+		if (read_msg(&msg, &sa) < 0)
+			break;
+		process_msg(&msg, &sa);
 	}
-
-	return NULL;
 }
 
 static int
@@ -570,7 +572,6 @@ rte_mp_channel_init(void)
 {
 	char path[PATH_MAX];
 	int dir_fd;
-	pthread_t mp_handle_tid;
 
 	/* create filter path */
 	create_socket_path("*", path, sizeof(path));
@@ -585,36 +586,32 @@ rte_mp_channel_init(void)
 	if (dir_fd < 0) {
 		RTE_LOG(ERR, EAL, "failed to open %s: %s\n",
 			mp_dir_path, strerror(errno));
-		return -1;
+		goto fail;
 	}
 
 	if (flock(dir_fd, LOCK_EX)) {
 		RTE_LOG(ERR, EAL, "failed to lock %s: %s\n",
 			mp_dir_path, strerror(errno));
-		close(dir_fd);
-		return -1;
+		goto fail;
 	}
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
 			unlink_sockets(mp_filter)) {
 		RTE_LOG(ERR, EAL, "failed to unlink mp sockets\n");
-		close(dir_fd);
-		return -1;
+		goto fail;
 	}
 
 	if (open_socket_fd() < 0) {
-		close(dir_fd);
-		return -1;
+		goto fail;
 	}
 
-	if (rte_ctrl_thread_create(&mp_handle_tid, "rte_mp_handle",
-			NULL, mp_handle, NULL) < 0) {
-		RTE_LOG(ERR, EAL, "failed to create mp thead: %s\n",
-			strerror(errno));
-		close(mp_fd);
-		close(dir_fd);
-		mp_fd = -1;
-		return -1;
+	ipc_intr_handle.fd = mp_fd;
+	ipc_intr_handle.type = RTE_INTR_HANDLE_IPC;
+
+	if (rte_intr_callback_register(&ipc_intr_handle, mp_handle, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "failed to register IPC interrupt callback: %s\n",
+				strerror(errno));
+		goto fail;
 	}
 
 	/* unlock the directory */
@@ -622,6 +619,13 @@ rte_mp_channel_init(void)
 	close(dir_fd);
 
 	return 0;
+fail:
+	if (dir_fd >= 0)
+		close(dir_fd);
+	if (mp_fd >= 0)
+		close(mp_fd);
+	mp_fd = -1;
+	return -1;
 }
 
 /**
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] Remove IPC threads
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (7 preceding siblings ...)
  2018-06-15 14:25 ` [dpdk-dev] [PATCH 8/8] ipc: remove main IPC thread Anatoly Burakov
@ 2018-06-26  1:19 ` Zhang, Qi Z
  2018-06-26  7:03 ` Zhang, Qi Z
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Zhang, Qi Z @ 2018-06-26  1:19 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: Ananyev, Konstantin, thomas, Richardson, Bruce

Hi Anatoly and Thomas:

Sorry for raise this late, but seems merge mp thread into interrupt thread gives problem to enable hotplug on secondary [1].

The issue is, when secondary want to attach a share device, it send request to primary
Then primary is running in mp callback (mp thread) to attach device, it will call rte_malloc which get chance to increase heap that will do sync IPC,
You know, this is the limitation we can't do sync IPC in mp thread itself. so the solution is try to move real work to a separate thread which has no limitation to do sync IPC, 
and interrupt thread is the good candidate, because we just need to call rte_eal_set_alarm and we don't need to worry about the execution sequence.

But if we merge mp thread into interrupt thread, the solution will not work, we may need to create specific temporal thread to handle callback, but this looks like some re-build which we already have.
So I think we need to revisit if we need to merge the thread before we have a good solution for such kind of issue.

Thanks
Qi

[1] https://mails.dpdk.org/archives/dev/2018-June/105018.html


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Anatoly Burakov
> Sent: Friday, June 15, 2018 10:25 PM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> thomas@monjalon.net; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [dpdk-dev] [PATCH 0/8] Remove IPC threads
> 
> As previously discussed [1], IPC threads need to be removed and their
> workload moved to interrupt thread.
> 
> FreeBSD did not have an interrupt thread, nor did it support alarm API. This
> patchset adds support for both on FreeBSD. FreeBSD interrupt thread is based
> on kevent, FreeBSD's native event multiplexing mechanism similar to Linux's
> epoll.
> 
> The patchset makes FreeBSD's interrupts and alarm work just enough to
> suffice for purposes of IPC, however there are really weird problems observed.
> Specifically, FreeBSD's kevent timers are really slow to trigger for some reason,
> sleeping on a 10ms timer as much as 200ms before waking up. Interrupt
> handling on fd's is also a bit flaky.
> 
> It has also been observed that both problems go away if we do not affinitize
> master lcore (by commenting relevant code out [2]). It is not known why these
> problems are observed, nor it is clear what a solution might entail.
> 
> For the purposes of making IPC work and having rudimentary support for
> alarm and interrupt API's, this patchset works fine. However, because of the
> above described issues, documentation will not be updated to indicate
> support for interrupts on FreeBSD at this time.
> 
> [1] http://dpdk.org/dev/patchwork/patch/36579/
> [2] http://dpdk.org/browse/dpdk/tree/lib/librte_eal/bsdapp/eal/eal.c#n729
> 
> Anatoly Burakov (4):
>   ipc: remove IPC thread for async requests
>   eal/bsdapp: add interrupt thread
>   eal/bsdapp: add alarm support
>   ipc: remove main IPC thread
> 
> Jianfeng Tan (4):
>   eal/linux: use glibc malloc in alarm
>   eal/linux: use glibc malloc in interrupt handling
>   eal: bring forward init of interrupt handling
>   eal: add IPC type for interrupt thread
> 
>  lib/librte_eal/bsdapp/eal/eal.c               |  10 +-
>  lib/librte_eal/bsdapp/eal/eal_alarm.c         | 299 +++++++++++-
>  lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 +
>  lib/librte_eal/bsdapp/eal/eal_interrupts.c    | 460 +++++++++++++++++-
>  lib/librte_eal/common/eal_common_proc.c       | 243 ++++-----
>  .../common/include/rte_eal_interrupts.h       |   1 +
>  lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
>  lib/librte_eal/linuxapp/eal/eal_alarm.c       |   9 +-
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c  |  19 +-
>  test/test/test_interrupts.c                   |  29 +-
>  10 files changed, 899 insertions(+), 200 deletions(-)  create mode 100644
> lib/librte_eal/bsdapp/eal/eal_alarm_private.h
> 
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] Remove IPC threads
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (8 preceding siblings ...)
  2018-06-26  1:19 ` [dpdk-dev] [PATCH 0/8] Remove IPC threads Zhang, Qi Z
@ 2018-06-26  7:03 ` Zhang, Qi Z
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 0/7] Remove asynchronous IPC thread Anatoly Burakov
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Zhang, Qi Z @ 2018-06-26  7:03 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: Ananyev, Konstantin, thomas, Richardson, Bruce


> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Tuesday, June 26, 2018 9:19 AM
> To: 'Anatoly Burakov' <anatoly.burakov@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> thomas@monjalon.net; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH 0/8] Remove IPC threads
> 
> Hi Anatoly and Thomas:
> 
> Sorry for raise this late, but seems merge mp thread into interrupt thread gives
> problem to enable hotplug on secondary [1].
> 
> The issue is, when secondary want to attach a share device, it send request to
> primary Then primary is running in mp callback (mp thread) to attach device, it
> will call rte_malloc which get chance to increase heap that will do sync IPC, You
> know, this is the limitation we can't do sync IPC in mp thread itself. so the
> solution is try to move real work to a separate thread which has no limitation
> to do sync IPC, and interrupt thread is the good candidate, because we just
> need to call rte_eal_set_alarm and we don't need to worry about the
> execution sequence.
> 
> But if we merge mp thread into interrupt thread, the solution will not work, we
> may need to create specific temporal thread to handle callback, but this looks
> like some re-build which we already have.

Ok, actually this method looks good to me :), I will send v4 to have this,
But please let me know if you have better idea.

> So I think we need to revisit if we need to merge the thread before we have a
> good solution for such kind of issue.
> 
> Thanks
> Qi
> 
> [1] https://mails.dpdk.org/archives/dev/2018-June/105018.html
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Anatoly Burakov
> > Sent: Friday, June 15, 2018 10:25 PM
> > To: dev@dpdk.org
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > thomas@monjalon.net; Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: [dpdk-dev] [PATCH 0/8] Remove IPC threads
> >
> > As previously discussed [1], IPC threads need to be removed and their
> > workload moved to interrupt thread.
> >
> > FreeBSD did not have an interrupt thread, nor did it support alarm
> > API. This patchset adds support for both on FreeBSD. FreeBSD interrupt
> > thread is based on kevent, FreeBSD's native event multiplexing
> > mechanism similar to Linux's epoll.
> >
> > The patchset makes FreeBSD's interrupts and alarm work just enough to
> > suffice for purposes of IPC, however there are really weird problems
> observed.
> > Specifically, FreeBSD's kevent timers are really slow to trigger for
> > some reason, sleeping on a 10ms timer as much as 200ms before waking
> > up. Interrupt handling on fd's is also a bit flaky.
> >
> > It has also been observed that both problems go away if we do not
> > affinitize master lcore (by commenting relevant code out [2]). It is
> > not known why these problems are observed, nor it is clear what a solution
> might entail.
> >
> > For the purposes of making IPC work and having rudimentary support for
> > alarm and interrupt API's, this patchset works fine. However, because
> > of the above described issues, documentation will not be updated to
> > indicate support for interrupts on FreeBSD at this time.
> >
> > [1] http://dpdk.org/dev/patchwork/patch/36579/
> > [2]
> > http://dpdk.org/browse/dpdk/tree/lib/librte_eal/bsdapp/eal/eal.c#n729
> >
> > Anatoly Burakov (4):
> >   ipc: remove IPC thread for async requests
> >   eal/bsdapp: add interrupt thread
> >   eal/bsdapp: add alarm support
> >   ipc: remove main IPC thread
> >
> > Jianfeng Tan (4):
> >   eal/linux: use glibc malloc in alarm
> >   eal/linux: use glibc malloc in interrupt handling
> >   eal: bring forward init of interrupt handling
> >   eal: add IPC type for interrupt thread
> >
> >  lib/librte_eal/bsdapp/eal/eal.c               |  10 +-
> >  lib/librte_eal/bsdapp/eal/eal_alarm.c         | 299 +++++++++++-
> >  lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 +
> >  lib/librte_eal/bsdapp/eal/eal_interrupts.c    | 460 +++++++++++++++++-
> >  lib/librte_eal/common/eal_common_proc.c       | 243 ++++-----
> >  .../common/include/rte_eal_interrupts.h       |   1 +
> >  lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
> >  lib/librte_eal/linuxapp/eal/eal_alarm.c       |   9 +-
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c  |  19 +-
> >  test/test/test_interrupts.c                   |  29 +-
> >  10 files changed, 899 insertions(+), 200 deletions(-)  create mode
> > 100644 lib/librte_eal/bsdapp/eal/eal_alarm_private.h
> >
> > --
> > 2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 0/7] Remove asynchronous IPC thread
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (9 preceding siblings ...)
  2018-06-26  7:03 ` Zhang, Qi Z
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-07-13 10:44   ` Thomas Monjalon
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 1/7] eal/linux: use glibc malloc in alarm Anatoly Burakov
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, thomas, bruce.richardson, qi.z.zhang

As previously discussed [1], IPC threads need to be removed and their
workload moved to interrupt thread. However, due to upcoming changes
for multiprocess hotplug, it was decided to leave one IPC thread in
place for now to avoid deadlocks on memory allocation attempts [2].

FreeBSD did not have an interrupt thread, nor did it support alarm
API. This patchset adds support for both on FreeBSD. FreeBSD interrupt
thread is based on kevent, FreeBSD's native event multiplexing
mechanism similar to Linux's epoll.

The patchset makes FreeBSD's interrupts and alarm work just enough to
suffice for purposes of IPC, however there are really weird problems
observed. Specifically, FreeBSD's kevent timers are really slow to
trigger for some reason, sleeping on a 10ms timer as much as 200ms
before waking up. Interrupt handling on fd's is also a bit flaky.

It has also been observed that both problems go away if we do not
affinitize master lcore (by commenting relevant code out [3]). It is
not known why these problems are observed, nor it is clear what a
solution might entail.

For the purposes of making IPC work and having rudimentary support
for alarm and interrupt API's, this patchset works fine. However,
because of the above described issues, documentation will not be
updated to indicate support for interrupts on FreeBSD at this time.

v1->v2:
- Leave IPC thread in place and only remove asynchronous IPC thread

RFC->v1:
- FreeBSD support

[1] https://patches.dpdk.org/patch/36579/
[2] https://patches.dpdk.org/patch/41512/
[3] https://git.dpdk.org/dpdk/tree/lib/librte_eal/bsdapp/eal/eal.c#n729

Anatoly Burakov (4):
  eal/bsdapp: add interrupt thread
  eal/bsdapp: add alarm support
  ipc: remove IPC thread for async requests
  doc: document IPC callback limitations

Jianfeng Tan (3):
  eal/linux: use glibc malloc in alarm
  eal/linux: use glibc malloc in interrupt handling
  eal: bring forward init of interrupt handling

 doc/guides/prog_guide/multi_proc_support.rst  |  17 +-
 lib/librte_eal/bsdapp/eal/eal.c               |  10 +-
 lib/librte_eal/bsdapp/eal/eal_alarm.c         | 299 +++++++++++-
 lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 +
 lib/librte_eal/bsdapp/eal/eal_interrupts.c    | 455 +++++++++++++++++-
 lib/librte_eal/common/eal_common_proc.c       | 201 +++-----
 lib/librte_eal/linuxapp/eal/eal.c             |  10 +-
 lib/librte_eal/linuxapp/eal/eal_alarm.c       |   9 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c  |  14 +-
 9 files changed, 852 insertions(+), 182 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_alarm_private.h

-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 1/7] eal/linux: use glibc malloc in alarm
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (10 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 0/7] Remove asynchronous IPC thread Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 2/7] eal/linux: use glibc malloc in interrupt handling Anatoly Burakov
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev
  Cc: Jianfeng Tan, konstantin.ananyev, thomas, bruce.richardson, qi.z.zhang

From: Jianfeng Tan <jianfeng.tan@intel.com>

Alarm API is going to be used by IPC internally. However, because
memory subsystem depends on IPC, alarm API cannot use rte_malloc as
it creates a circular dependency.

To avoid such chicken and egg problem, we change to use glibc malloc
in the alarm API.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_alarm.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_alarm.c b/lib/librte_eal/linuxapp/eal/eal_alarm.c
index c115e823a..391d2a65f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_alarm.c
+++ b/lib/librte_eal/linuxapp/eal/eal_alarm.c
@@ -19,7 +19,6 @@
 #include <rte_launch.h>
 #include <rte_lcore.h>
 #include <rte_errno.h>
-#include <rte_malloc.h>
 #include <rte_spinlock.h>
 #include <eal_private.h>
 
@@ -91,7 +90,7 @@ eal_alarm_callback(void *arg __rte_unused)
 		rte_spinlock_lock(&alarm_list_lk);
 
 		LIST_REMOVE(ap, next);
-		rte_free(ap);
+		free(ap);
 	}
 
 	if (!LIST_EMPTY(&alarm_list)) {
@@ -122,7 +121,7 @@ rte_eal_alarm_set(uint64_t us, rte_eal_alarm_callback cb_fn, void *cb_arg)
 	if (us < 1 || us > (UINT64_MAX - US_PER_S) || cb_fn == NULL)
 		return -EINVAL;
 
-	new_alarm = rte_zmalloc(NULL, sizeof(*new_alarm), 0);
+	new_alarm = calloc(1, sizeof(*new_alarm));
 	if (new_alarm == NULL)
 		return -ENOMEM;
 
@@ -196,7 +195,7 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
 
 			if (ap->executing == 0) {
 				LIST_REMOVE(ap, next);
-				rte_free(ap);
+				free(ap);
 				count++;
 			} else {
 				/* If calling from other context, mark that alarm is executing
@@ -220,7 +219,7 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
 
 				if (ap->executing == 0) {
 					LIST_REMOVE(ap, next);
-					rte_free(ap);
+					free(ap);
 					count++;
 					ap = ap_prev;
 				} else if (pthread_equal(ap->executing_id, pthread_self()) == 0)
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 2/7] eal/linux: use glibc malloc in interrupt handling
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (11 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 1/7] eal/linux: use glibc malloc in alarm Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 3/7] eal/bsdapp: add interrupt thread Anatoly Burakov
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev
  Cc: Jianfeng Tan, konstantin.ananyev, thomas, bruce.richardson, qi.z.zhang

From: Jianfeng Tan <jianfeng.tan@intel.com>

IPC uses interrupts API internally, and memory subsystem uses IPC.
Therefore, IPC should not use rte_malloc to avoid circular dependency.
Switch to using regular glibc malloc in interrupts API.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 056d41c12..180c0378a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -30,7 +30,6 @@
 #include <rte_branch_prediction.h>
 #include <rte_debug.h>
 #include <rte_log.h>
-#include <rte_malloc.h>
 #include <rte_errno.h>
 #include <rte_spinlock.h>
 #include <rte_pause.h>
@@ -405,8 +404,7 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 	}
 
 	/* allocate a new interrupt callback entity */
-	callback = rte_zmalloc("interrupt callback list",
-				sizeof(*callback), 0);
+	callback = calloc(1, sizeof(*callback));
 	if (callback == NULL) {
 		RTE_LOG(ERR, EAL, "Can not allocate memory\n");
 		return -ENOMEM;
@@ -431,10 +429,10 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 
 	/* no existing callbacks for this - add new source */
 	if (src == NULL) {
-		if ((src = rte_zmalloc("interrupt source list",
-				sizeof(*src), 0)) == NULL) {
+		src = calloc(1, sizeof(*src));
+		if (src == NULL) {
 			RTE_LOG(ERR, EAL, "Can not allocate memory\n");
-			rte_free(callback);
+			free(callback);
 			ret = -ENOMEM;
 		} else {
 			src->intr_handle = *intr_handle;
@@ -501,7 +499,7 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
 			if (cb->cb_fn == cb_fn && (cb_arg == (void *)-1 ||
 					cb->cb_arg == cb_arg)) {
 				TAILQ_REMOVE(&src->callbacks, cb, next);
-				rte_free(cb);
+				free(cb);
 				ret++;
 			}
 		}
@@ -509,7 +507,7 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
 		/* all callbacks for that source are removed. */
 		if (TAILQ_EMPTY(&src->callbacks)) {
 			TAILQ_REMOVE(&intr_sources, src, next);
-			rte_free(src);
+			free(src);
 		}
 	}
 
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 3/7] eal/bsdapp: add interrupt thread
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (12 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 2/7] eal/linux: use glibc malloc in interrupt handling Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 4/7] eal/bsdapp: add alarm support Anatoly Burakov
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, konstantin.ananyev, thomas, qi.z.zhang

Add interrupt thread to FreeBSD. It is largely a copy-paste from
Linuxapp interrupt thread, except for a few key differences:

* Use kevent instead of epoll
* Do not recreate the event queue on adding/removing interrupt
  sources, add/remove them to/from the queue on the fly instead
* No support for UIO/VFIO handles

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_interrupts.c | 436 ++++++++++++++++++++-
 1 file changed, 418 insertions(+), 18 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 290d53ab9..d0db6c6ff 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -1,51 +1,451 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2010-2018 Intel Corporation
  */
 
+#include <string.h>
+#include <sys/types.h>
+#include <sys/event.h>
+#include <sys/queue.h>
+#include <unistd.h>
+
+#include <rte_errno.h>
+#include <rte_interrupts.h>
+#include <rte_lcore.h>
+#include <rte_spinlock.h>
 #include <rte_common.h>
 #include <rte_interrupts.h>
+
 #include "eal_private.h"
 
+#define MAX_INTR_EVENTS 16
+
+/**
+ * union buffer for reading on different devices
+ */
+union rte_intr_read_buffer {
+	char charbuf[16];                /* for others */
+};
+
+TAILQ_HEAD(rte_intr_cb_list, rte_intr_callback);
+TAILQ_HEAD(rte_intr_source_list, rte_intr_source);
+
+struct rte_intr_callback {
+	TAILQ_ENTRY(rte_intr_callback) next;
+	rte_intr_callback_fn cb_fn;  /**< callback address */
+	void *cb_arg;                /**< parameter for callback */
+};
+
+struct rte_intr_source {
+	TAILQ_ENTRY(rte_intr_source) next;
+	struct rte_intr_handle intr_handle; /**< interrupt handle */
+	struct rte_intr_cb_list callbacks;  /**< user callbacks */
+	uint32_t active;
+};
+
+/* global spinlock for interrupt data operation */
+static rte_spinlock_t intr_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* interrupt sources list */
+static struct rte_intr_source_list intr_sources;
+
+/* interrupt handling thread */
+static pthread_t intr_thread;
+
+static volatile int kq = -1;
+
+static int
+intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
+{
+	ke->filter = EVFILT_READ;
+	ke->ident = ih->fd;
+
+	return 0;
+}
+
 int
 rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
-			rte_intr_callback_fn cb,
-			void *cb_arg)
+		rte_intr_callback_fn cb, void *cb_arg)
 {
-	RTE_SET_USED(intr_handle);
-	RTE_SET_USED(cb);
-	RTE_SET_USED(cb_arg);
+	struct rte_intr_callback *callback = NULL;
+	struct rte_intr_source *src = NULL;
+	int ret, add_event;
 
-	return -ENOTSUP;
+	/* first do parameter checking */
+	if (intr_handle == NULL || intr_handle->fd < 0 || cb == NULL) {
+		RTE_LOG(ERR, EAL,
+			"Registering with invalid input parameter\n");
+		return -EINVAL;
+	}
+	if (kq < 0) {
+		RTE_LOG(ERR, EAL, "Kqueue is not active: %d\n", kq);
+		return -ENODEV;
+	}
+
+	/* allocate a new interrupt callback entity */
+	callback = calloc(1, sizeof(*callback));
+	if (callback == NULL) {
+		RTE_LOG(ERR, EAL, "Can not allocate memory\n");
+		return -ENOMEM;
+	}
+	callback->cb_fn = cb;
+	callback->cb_arg = cb_arg;
+
+	rte_spinlock_lock(&intr_lock);
+
+	/* check if there is at least one callback registered for the fd */
+	TAILQ_FOREACH(src, &intr_sources, next) {
+		if (src->intr_handle.fd == intr_handle->fd) {
+			/* we had no interrupts for this */
+			if (TAILQ_EMPTY(&src->callbacks))
+				add_event = 1;
+
+			TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
+			ret = 0;
+			break;
+		}
+	}
+
+	/* no existing callbacks for this - add new source */
+	if (src == NULL) {
+		src = calloc(1, sizeof(*src));
+		if (src == NULL) {
+			RTE_LOG(ERR, EAL, "Can not allocate memory\n");
+			ret = -ENOMEM;
+			goto fail;
+		} else {
+			src->intr_handle = *intr_handle;
+			TAILQ_INIT(&src->callbacks);
+			TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
+			TAILQ_INSERT_TAIL(&intr_sources, src, next);
+			add_event = 1;
+			ret = 0;
+		}
+	}
+
+	/* add events to the queue */
+	if (add_event) {
+		struct kevent ke;
+
+		memset(&ke, 0, sizeof(ke));
+		ke.flags = EV_ADD; /* mark for addition to the queue */
+
+		if (intr_source_to_kevent(intr_handle, &ke) < 0) {
+			RTE_LOG(ERR, EAL, "Cannot convert interrupt handle to kevent\n");
+			ret = -ENODEV;
+			goto fail;
+		}
+
+		/**
+		 * add the intr file descriptor into wait list.
+		 */
+		if (kevent(kq, &ke, 1, NULL, 0, NULL) < 0) {
+			RTE_LOG(ERR, EAL, "Error adding fd %d kevent, %s\n",
+				src->intr_handle.fd, strerror(errno));
+			ret = -errno;
+			goto fail;
+		}
+	}
+	rte_spinlock_unlock(&intr_lock);
+
+	return ret;
+fail:
+	/* clean up */
+	if (src != NULL) {
+		TAILQ_REMOVE(&(src->callbacks), callback, next);
+		if (TAILQ_EMPTY(&(src->callbacks))) {
+			TAILQ_REMOVE(&intr_sources, src, next);
+			free(src);
+		}
+	}
+	free(callback);
+	rte_spinlock_unlock(&intr_lock);
+	return ret;
 }
 
 int
 rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
-			rte_intr_callback_fn cb,
-			void *cb_arg)
+		rte_intr_callback_fn cb_fn, void *cb_arg)
 {
-	RTE_SET_USED(intr_handle);
-	RTE_SET_USED(cb);
-	RTE_SET_USED(cb_arg);
+	int ret;
+	struct rte_intr_source *src;
+	struct rte_intr_callback *cb, *next;
 
-	return -ENOTSUP;
+	/* do parameter checking first */
+	if (intr_handle == NULL || intr_handle->fd < 0) {
+		RTE_LOG(ERR, EAL,
+		"Unregistering with invalid input parameter\n");
+		return -EINVAL;
+	}
+	if (kq < 0) {
+		RTE_LOG(ERR, EAL, "Kqueue is not active\n");
+		return -ENODEV;
+	}
+
+	rte_spinlock_lock(&intr_lock);
+
+	/* check if the insterrupt source for the fd is existent */
+	TAILQ_FOREACH(src, &intr_sources, next)
+		if (src->intr_handle.fd == intr_handle->fd)
+			break;
+
+	/* No interrupt source registered for the fd */
+	if (src == NULL) {
+		ret = -ENOENT;
+
+	/* interrupt source has some active callbacks right now. */
+	} else if (src->active != 0) {
+		ret = -EAGAIN;
+
+	/* ok to remove. */
+	} else {
+		struct kevent ke;
+
+		ret = 0;
+
+		/* remove it from the kqueue */
+		memset(&ke, 0, sizeof(ke));
+		ke.flags = EV_DELETE; /* mark for deletion from the queue */
+
+		if (intr_source_to_kevent(intr_handle, &ke) < 0) {
+			RTE_LOG(ERR, EAL, "Cannot convert to kevent\n");
+			ret = -ENODEV;
+			goto out;
+		}
+
+		/**
+		 * remove intr file descriptor from wait list.
+		 */
+		if (kevent(kq, &ke, 1, NULL, 0, NULL) < 0) {
+			RTE_LOG(ERR, EAL, "Error removing fd %d kevent, %s\n",
+				src->intr_handle.fd, strerror(errno));
+			ret = -errno;
+			goto out;
+		}
+
+		/*walk through the callbacks and remove all that match. */
+		for (cb = TAILQ_FIRST(&src->callbacks); cb != NULL; cb = next) {
+			next = TAILQ_NEXT(cb, next);
+			if (cb->cb_fn == cb_fn && (cb_arg == (void *)-1 ||
+					cb->cb_arg == cb_arg)) {
+				TAILQ_REMOVE(&src->callbacks, cb, next);
+				free(cb);
+				ret++;
+			}
+		}
+
+		/* all callbacks for that source are removed. */
+		if (TAILQ_EMPTY(&src->callbacks)) {
+			TAILQ_REMOVE(&intr_sources, src, next);
+			free(src);
+		}
+	}
+out:
+	rte_spinlock_unlock(&intr_lock);
+
+	return ret;
 }
 
 int
-rte_intr_enable(const struct rte_intr_handle *intr_handle __rte_unused)
+rte_intr_enable(const struct rte_intr_handle *intr_handle)
 {
-	return -ENOTSUP;
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
 }
 
 int
-rte_intr_disable(const struct rte_intr_handle *intr_handle __rte_unused)
+rte_intr_disable(const struct rte_intr_handle *intr_handle)
 {
-	return -ENOTSUP;
+	if (intr_handle && intr_handle->type == RTE_INTR_HANDLE_VDEV)
+		return 0;
+
+	if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+		return -1;
+
+	switch (intr_handle->type) {
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_ALARM:
+		return -1;
+	/* not used at this moment */
+	case RTE_INTR_HANDLE_DEV_EVENT:
+		return -1;
+	/* unknown handle type */
+	default:
+		RTE_LOG(ERR, EAL,
+			"Unknown handle type of fd %d\n",
+					intr_handle->fd);
+		return -1;
+	}
+
+	return 0;
+}
+
+static void
+eal_intr_process_interrupts(struct kevent *events, int nfds)
+{
+	struct rte_intr_callback active_cb;
+	union rte_intr_read_buffer buf;
+	struct rte_intr_callback *cb;
+	struct rte_intr_source *src;
+	bool call = false;
+	int n, bytes_read;
+
+	for (n = 0; n < nfds; n++) {
+		int event_fd = events[n].ident;
+
+		rte_spinlock_lock(&intr_lock);
+		TAILQ_FOREACH(src, &intr_sources, next)
+			if (src->intr_handle.fd == event_fd)
+				break;
+		if (src == NULL) {
+			rte_spinlock_unlock(&intr_lock);
+			continue;
+		}
+
+		/* mark this interrupt source as active and release the lock. */
+		src->active = 1;
+		rte_spinlock_unlock(&intr_lock);
+
+		/* set the length to be read dor different handle type */
+		switch (src->intr_handle.type) {
+		case RTE_INTR_HANDLE_ALARM:
+			bytes_read = 0;
+			call = true;
+			break;
+		case RTE_INTR_HANDLE_VDEV:
+		case RTE_INTR_HANDLE_EXT:
+			bytes_read = 0;
+			call = true;
+			break;
+		case RTE_INTR_HANDLE_DEV_EVENT:
+			bytes_read = 0;
+			call = true;
+			break;
+		default:
+			bytes_read = 1;
+			break;
+		}
+
+		if (bytes_read > 0) {
+			/**
+			 * read out to clear the ready-to-be-read flag
+			 * for epoll_wait.
+			 */
+			bytes_read = read(event_fd, &buf, bytes_read);
+			if (bytes_read < 0) {
+				if (errno == EINTR || errno == EWOULDBLOCK)
+					continue;
+
+				RTE_LOG(ERR, EAL, "Error reading from file "
+					"descriptor %d: %s\n",
+					event_fd,
+					strerror(errno));
+			} else if (bytes_read == 0)
+				RTE_LOG(ERR, EAL, "Read nothing from file "
+					"descriptor %d\n", event_fd);
+			else
+				call = true;
+		}
+
+		/* grab a lock, again to call callbacks and update status. */
+		rte_spinlock_lock(&intr_lock);
+
+		if (call) {
+			/* Finally, call all callbacks. */
+			TAILQ_FOREACH(cb, &src->callbacks, next) {
+
+				/* make a copy and unlock. */
+				active_cb = *cb;
+				rte_spinlock_unlock(&intr_lock);
+
+				/* call the actual callback */
+				active_cb.cb_fn(active_cb.cb_arg);
+
+				/*get the lock back. */
+				rte_spinlock_lock(&intr_lock);
+			}
+		}
+
+		/* we done with that interrupt source, release it. */
+		src->active = 0;
+		rte_spinlock_unlock(&intr_lock);
+	}
+}
+
+static void *
+eal_intr_thread_main(void *arg __rte_unused)
+{
+	struct kevent events[MAX_INTR_EVENTS];
+	int nfds;
+
+	/* host thread, never break out */
+	for (;;) {
+		/* do not change anything, just wait */
+		nfds = kevent(kq, NULL, 0, events, MAX_INTR_EVENTS, NULL);
+
+		/* kevent fail */
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			RTE_LOG(ERR, EAL,
+				"kevent returns with fail\n");
+			break;
+		}
+		/* kevent timeout, will never happen here */
+		else if (nfds == 0)
+			continue;
+
+		/* kevent has at least one fd ready to read */
+		eal_intr_process_interrupts(events, nfds);
+	}
+	close(kq);
+	kq = -1;
+	return NULL;
 }
 
 int
 rte_eal_intr_init(void)
 {
-	return 0;
+	int ret = 0;
+
+	/* init the global interrupt source head */
+	TAILQ_INIT(&intr_sources);
+
+	kq = kqueue();
+	if (kq < 0) {
+		RTE_LOG(ERR, EAL, "Cannot create kqueue instance\n");
+		return -1;
+	}
+
+	/* create the host thread to wait/handle the interrupt */
+	ret = rte_ctrl_thread_create(&intr_thread, "eal-intr-thread", NULL,
+			eal_intr_thread_main, NULL);
+	if (ret != 0) {
+		rte_errno = -ret;
+		RTE_LOG(ERR, EAL,
+			"Failed to create thread for interrupt handling\n");
+	}
+
+	return ret;
 }
 
 int
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 4/7] eal/bsdapp: add alarm support
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (13 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 3/7] eal/bsdapp: add interrupt thread Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling Anatoly Burakov
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson, konstantin.ananyev, thomas, qi.z.zhang

Implement EAL alarm API support for FreeBSD. The implementation
is largely identical to that of Linux version, with one key
difference.

The alarm API is a little Linux-centric in that it is expecting
the alarm API to manage alarm timeouts without involvement of the
interrupt thread. This works on Linux because in Linux, there's
timerfd API which allows waiting for timer events on an fd.

On FreeBSD, however, there are no timerfd's, and timer events are
set up directly in kevent. There is no way to pass information from
the alarm API to the interrupt thread, so we also add a little
back-channel magic to get soonest alarm timeout from the alarm API.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_alarm.c         | 299 +++++++++++++++++-
 lib/librte_eal/bsdapp/eal/eal_alarm_private.h |  19 ++
 lib/librte_eal/bsdapp/eal/eal_interrupts.c    |  29 +-
 3 files changed, 334 insertions(+), 13 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_alarm_private.h

diff --git a/lib/librte_eal/bsdapp/eal/eal_alarm.c b/lib/librte_eal/bsdapp/eal/eal_alarm.c
index eb3913c97..51ea4b8c0 100644
--- a/lib/librte_eal/bsdapp/eal/eal_alarm.c
+++ b/lib/librte_eal/bsdapp/eal/eal_alarm.c
@@ -1,31 +1,314 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2010-2018 Intel Corporation
  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <stdio.h>
 #include <stdlib.h>
+#include <string.h>
+#include <time.h>
 #include <errno.h>
 
 #include <rte_alarm.h>
+#include <rte_cycles.h>
 #include <rte_common.h>
+#include <rte_errno.h>
+#include <rte_interrupts.h>
+#include <rte_spinlock.h>
+
 #include "eal_private.h"
+#include "eal_alarm_private.h"
+
+#define NS_PER_US 1000
+
+#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
+#else
+#define CLOCK_TYPE_ID CLOCK_MONOTONIC
+#endif
+
+struct alarm_entry {
+	LIST_ENTRY(alarm_entry) next;
+	struct rte_intr_handle handle;
+	struct timespec time;
+	rte_eal_alarm_callback cb_fn;
+	void *cb_arg;
+	volatile uint8_t executing;
+	volatile pthread_t executing_id;
+};
+
+static LIST_HEAD(alarm_list, alarm_entry) alarm_list = LIST_HEAD_INITIALIZER();
+static rte_spinlock_t alarm_list_lk = RTE_SPINLOCK_INITIALIZER;
+
+static struct rte_intr_handle intr_handle = {.fd = -1 };
+static void eal_alarm_callback(void *arg);
 
 int
 rte_eal_alarm_init(void)
 {
+	intr_handle.type = RTE_INTR_HANDLE_ALARM;
+
+	/* on FreeBSD, timers don't use fd's, and their identifiers are stored
+	 * in separate namespace from fd's, so using any value is OK. however,
+	 * EAL interrupts handler expects fd's to be unique, so use an actual fd
+	 * to guarantee unique timer identifier.
+	 */
+	intr_handle.fd = open("/dev/zero", O_RDONLY);
+
+	return 0;
+}
+
+static inline int
+timespec_cmp(const struct timespec *now, const struct timespec *at)
+{
+	if (now->tv_sec < at->tv_sec)
+		return -1;
+	if (now->tv_sec > at->tv_sec)
+		return 1;
+	if (now->tv_nsec < at->tv_nsec)
+		return -1;
+	if (now->tv_nsec > at->tv_nsec)
+		return 1;
+	return 0;
+}
+
+static inline uint64_t
+diff_ns(struct timespec *now, struct timespec *at)
+{
+	uint64_t now_ns, at_ns;
+
+	if (timespec_cmp(now, at) >= 0)
+		return 0;
+
+	now_ns = now->tv_sec * NS_PER_S + now->tv_nsec;
+	at_ns = at->tv_sec * NS_PER_S + at->tv_nsec;
+
+	return at_ns - now_ns;
+}
+
+int
+eal_alarm_get_timeout_ns(uint64_t *val)
+{
+	struct alarm_entry *ap;
+	struct timespec now;
+
+	if (clock_gettime(CLOCK_TYPE_ID, &now) < 0)
+		return -1;
+
+	if (LIST_EMPTY(&alarm_list))
+		return -1;
+
+	ap = LIST_FIRST(&alarm_list);
+
+	*val = diff_ns(&now, &ap->time);
+
 	return 0;
 }
 
+static int
+unregister_current_callback(void)
+{
+	struct alarm_entry *ap;
+	int ret = 0;
+
+	if (!LIST_EMPTY(&alarm_list)) {
+		ap = LIST_FIRST(&alarm_list);
+
+		do {
+			ret = rte_intr_callback_unregister(&intr_handle,
+				eal_alarm_callback, &ap->time);
+		} while (ret == -EAGAIN);
+	}
+
+	return ret;
+}
+
+static int
+register_first_callback(void)
+{
+	struct alarm_entry *ap;
+	int ret = 0;
+
+	if (!LIST_EMPTY(&alarm_list)) {
+		ap = LIST_FIRST(&alarm_list);
+
+		/* register a new callback */
+		ret = rte_intr_callback_register(&intr_handle,
+				eal_alarm_callback, &ap->time);
+	}
+	return ret;
+}
+
+static void
+eal_alarm_callback(void *arg __rte_unused)
+{
+	struct timespec now;
+	struct alarm_entry *ap;
+
+	rte_spinlock_lock(&alarm_list_lk);
+	ap = LIST_FIRST(&alarm_list);
+
+	if (clock_gettime(CLOCK_TYPE_ID, &now) < 0)
+		return;
+
+	while (ap != NULL && timespec_cmp(&now, &ap->time) >= 0) {
+		ap->executing = 1;
+		ap->executing_id = pthread_self();
+		rte_spinlock_unlock(&alarm_list_lk);
+
+		ap->cb_fn(ap->cb_arg);
+
+		rte_spinlock_lock(&alarm_list_lk);
+
+		LIST_REMOVE(ap, next);
+		free(ap);
+
+		ap = LIST_FIRST(&alarm_list);
+	}
+
+	/* timer has been deleted from the kqueue, so recreate it if needed */
+	register_first_callback();
+
+	rte_spinlock_unlock(&alarm_list_lk);
+}
+
 
 int
-rte_eal_alarm_set(uint64_t us __rte_unused,
-		rte_eal_alarm_callback cb_fn __rte_unused,
-		void *cb_arg __rte_unused)
+rte_eal_alarm_set(uint64_t us, rte_eal_alarm_callback cb_fn, void *cb_arg)
 {
-	return -ENOTSUP;
+	struct alarm_entry *ap, *new_alarm;
+	struct timespec now;
+	uint64_t ns;
+	int ret = 0;
+
+	/* check parameters, also ensure us won't cause a uint64_t overflow */
+	if (us < 1 || us > (UINT64_MAX - US_PER_S) || cb_fn == NULL)
+		return -EINVAL;
+
+	new_alarm = calloc(1, sizeof(*new_alarm));
+	if (new_alarm == NULL)
+		return -ENOMEM;
+
+	/* use current time to calculate absolute time of alarm */
+	clock_gettime(CLOCK_TYPE_ID, &now);
+
+	ns = us * NS_PER_US;
+
+	new_alarm->cb_fn = cb_fn;
+	new_alarm->cb_arg = cb_arg;
+	new_alarm->time.tv_nsec = (now.tv_nsec + ns) % NS_PER_S;
+	new_alarm->time.tv_sec = now.tv_sec + ((now.tv_nsec + ns) / NS_PER_S);
+
+	rte_spinlock_lock(&alarm_list_lk);
+
+	if (LIST_EMPTY(&alarm_list))
+		LIST_INSERT_HEAD(&alarm_list, new_alarm, next);
+	else {
+		LIST_FOREACH(ap, &alarm_list, next) {
+			if (timespec_cmp(&new_alarm->time, &ap->time) < 0) {
+				LIST_INSERT_BEFORE(ap, new_alarm, next);
+				break;
+			}
+			if (LIST_NEXT(ap, next) == NULL) {
+				LIST_INSERT_AFTER(ap, new_alarm, next);
+				break;
+			}
+		}
+	}
+
+	/* re-register first callback just in case */
+	register_first_callback();
+
+	rte_spinlock_unlock(&alarm_list_lk);
+
+	return ret;
 }
 
 int
-rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn __rte_unused,
-		void *cb_arg __rte_unused)
+rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void *cb_arg)
 {
-	return -ENOTSUP;
+	struct alarm_entry *ap, *ap_prev;
+	int count = 0;
+	int err = 0;
+	int executing;
+
+	if (!cb_fn) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	do {
+		executing = 0;
+		rte_spinlock_lock(&alarm_list_lk);
+		/* remove any matches at the start of the list */
+		while (1) {
+			ap = LIST_FIRST(&alarm_list);
+			if (ap == NULL)
+				break;
+			if (cb_fn != ap->cb_fn)
+				break;
+			if (cb_arg != ap->cb_arg && cb_arg != (void *) -1)
+				break;
+			if (ap->executing == 0) {
+				LIST_REMOVE(ap, next);
+				free(ap);
+				count++;
+			} else {
+				/* If calling from other context, mark that
+				 * alarm is executing so loop can spin till it
+				 * finish. Otherwise we are trying to cancel
+				 * ourselves - mark it by EINPROGRESS.
+				 */
+				if (pthread_equal(ap->executing_id,
+						pthread_self()) == 0)
+					executing++;
+				else
+					err = EINPROGRESS;
+
+				break;
+			}
+		}
+		ap_prev = ap;
+
+		/* now go through list, removing entries not at start */
+		LIST_FOREACH(ap, &alarm_list, next) {
+			/* this won't be true first time through */
+			if (cb_fn == ap->cb_fn &&
+					(cb_arg == (void *)-1 ||
+					 cb_arg == ap->cb_arg)) {
+				if (ap->executing == 0) {
+					LIST_REMOVE(ap, next);
+					free(ap);
+					count++;
+					ap = ap_prev;
+				} else if (pthread_equal(ap->executing_id,
+							 pthread_self()) == 0) {
+					executing++;
+				} else {
+					err = EINPROGRESS;
+				}
+			}
+			ap_prev = ap;
+		}
+		rte_spinlock_unlock(&alarm_list_lk);
+	} while (executing != 0);
+
+	if (count == 0 && err == 0)
+		rte_errno = ENOENT;
+	else if (err)
+		rte_errno = err;
+
+	rte_spinlock_lock(&alarm_list_lk);
+
+	/* unregister if no alarms left, otherwise re-register first */
+	if (LIST_EMPTY(&alarm_list))
+		unregister_current_callback();
+	else
+		register_first_callback();
+
+	rte_spinlock_unlock(&alarm_list_lk);
+
+	return count;
 }
diff --git a/lib/librte_eal/bsdapp/eal/eal_alarm_private.h b/lib/librte_eal/bsdapp/eal/eal_alarm_private.h
new file mode 100644
index 000000000..65c711518
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_alarm_private.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef EAL_ALARM_PRIVATE_H
+#define EAL_ALARM_PRIVATE_H
+
+#include <inttypes.h>
+
+/*
+ * FreeBSD needs a back-channel communication mechanism between interrupt and
+ * alarm thread, because on FreeBSD, timer period is set up inside the interrupt
+ * API and not inside alarm API like on Linux.
+ */
+
+int
+eal_alarm_get_timeout_ns(uint64_t *val);
+
+#endif // EAL_ALARM_PRIVATE_H
diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index d0db6c6ff..a4aac606e 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -16,6 +16,7 @@
 #include <rte_interrupts.h>
 
 #include "eal_private.h"
+#include "eal_alarm_private.h"
 
 #define MAX_INTR_EVENTS 16
 
@@ -56,7 +57,22 @@ static volatile int kq = -1;
 static int
 intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
 {
-	ke->filter = EVFILT_READ;
+	/* alarm callbacks are special case */
+	if (ih->type == RTE_INTR_HANDLE_ALARM) {
+		uint64_t timeout_ns;
+
+		/* get soonest alarm timeout */
+		if (eal_alarm_get_timeout_ns(&timeout_ns) < 0)
+			return -1;
+
+		ke->filter = EVFILT_TIMER;
+		/* timers are one shot */
+		ke->flags |= EV_ONESHOT;
+		ke->fflags = NOTE_NSECONDS;
+		ke->data = timeout_ns;
+	} else {
+		ke->filter = EVFILT_READ;
+	}
 	ke->ident = ih->fd;
 
 	return 0;
@@ -122,8 +138,10 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 		}
 	}
 
-	/* add events to the queue */
-	if (add_event) {
+	/* add events to the queue. timer events are special as we need to
+	 * re-set the timer.
+	 */
+	if (add_event || src->intr_handle.type == RTE_INTR_HANDLE_ALARM) {
 		struct kevent ke;
 
 		memset(&ke, 0, sizeof(ke));
@@ -218,8 +236,9 @@ rte_intr_callback_unregister(const struct rte_intr_handle *intr_handle,
 		if (kevent(kq, &ke, 1, NULL, 0, NULL) < 0) {
 			RTE_LOG(ERR, EAL, "Error removing fd %d kevent, %s\n",
 				src->intr_handle.fd, strerror(errno));
-			ret = -errno;
-			goto out;
+			/* removing non-existent even is an expected condition
+			 * in some circumstances (e.g. oneshot events).
+			 */
 		}
 
 		/*walk through the callbacks and remove all that match. */
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (14 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 4/7] eal/bsdapp: add alarm support Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-07-12 22:36   ` Thomas Monjalon
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 6/7] ipc: remove IPC thread for async requests Anatoly Burakov
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 7/7] doc: document IPC callback limitations Anatoly Burakov
  17 siblings, 1 reply; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev
  Cc: Jianfeng Tan, Bruce Richardson, konstantin.ananyev, thomas, qi.z.zhang

From: Jianfeng Tan <jianfeng.tan@intel.com>

Next commit will make asynchronous IPC requests rely on alarm API,
which in turn relies on interrupts to work. Therefore, move the EAL
interrupt initialization before IPC initialization to avoid breaking
IPC in the next commit.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c   | 10 +++++-----
 lib/librte_eal/linuxapp/eal/eal.c | 10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index dc279542d..f70f7aecd 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -625,6 +625,11 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
+	if (rte_eal_intr_init() < 0) {
+		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
+		return -1;
+	}
+
 	/* Put mp channel init before bus scan so that we can init the vdev
 	 * bus through mp channel in the secondary process before the bus scan.
 	 */
@@ -713,11 +718,6 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_intr_init() < 0) {
-		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
-		return -1;
-	}
-
 	if (rte_eal_timer_init() < 0) {
 		rte_eal_init_alert("Cannot init HPET or TSC timers\n");
 		rte_errno = ENOTSUP;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..f8a0c06d7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -839,6 +839,11 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
+	if (rte_eal_intr_init() < 0) {
+		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
+		return -1;
+	}
+
 	/* Put mp channel init before bus scan so that we can init the vdev
 	 * bus through mp channel in the secondary process before the bus scan.
 	 */
@@ -968,11 +973,6 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0) {
-		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
-		return -1;
-	}
-
 	RTE_LCORE_FOREACH_SLAVE(i) {
 
 		/*
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 6/7] ipc: remove IPC thread for async requests
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (15 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 7/7] doc: document IPC callback limitations Anatoly Burakov
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, thomas, bruce.richardson, qi.z.zhang, Jianfeng Tan

Previously, we were using two IPC threads - one to handle messages
and synchronous requests, and another to handle asynchronous requests.
To handle replies for an async request, rte_mp_handle woke up the
rte_mp_handle_async thread to process through pthread_cond variable.

Change it to handle asynchronous messages within the main IPC thread.
To handle timeout events, for each async request which is sent,
we set an alarm for it. If its reply is received before timeout,
we will cancel the alarm when we handle the reply; otherwise,
alarm will invoke the async_reply_handle() as the alarm callback.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
---

Notes:
    RFC->RFCv2:
    - Rebased on latest code
    - Implemented comments to the original RFC

 lib/librte_eal/common/eal_common_proc.c | 201 +++++++++---------------
 1 file changed, 70 insertions(+), 131 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index 707d8ab30..6f3366403 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -20,6 +20,7 @@
 #include <sys/un.h>
 #include <unistd.h>
 
+#include <rte_alarm.h>
 #include <rte_common.h>
 #include <rte_cycles.h>
 #include <rte_eal.h>
@@ -94,11 +95,9 @@ TAILQ_HEAD(pending_request_list, pending_request);
 static struct {
 	struct pending_request_list requests;
 	pthread_mutex_t lock;
-	pthread_cond_t async_cond;
 } pending_requests = {
 	.requests = TAILQ_HEAD_INITIALIZER(pending_requests.requests),
 	.lock = PTHREAD_MUTEX_INITIALIZER,
-	.async_cond = PTHREAD_COND_INITIALIZER
 	/**< used in async requests only */
 };
 
@@ -106,6 +105,16 @@ static struct {
 static int
 mp_send(struct rte_mp_msg *msg, const char *peer, int type);
 
+/* for use with alarm callback */
+static void
+async_reply_handle(void *arg);
+
+/* for use with process_msg */
+static struct pending_request *
+async_reply_handle_thread_unsafe(void *arg);
+
+static void
+trigger_async_action(struct pending_request *req);
 
 static struct pending_request *
 find_pending_request(const char *dst, const char *act_name)
@@ -290,6 +299,8 @@ process_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 	RTE_LOG(DEBUG, EAL, "msg: %s\n", msg->name);
 
 	if (m->type == MP_REP || m->type == MP_IGN) {
+		struct pending_request *req = NULL;
+
 		pthread_mutex_lock(&pending_requests.lock);
 		pending_req = find_pending_request(s->sun_path, msg->name);
 		if (pending_req) {
@@ -301,11 +312,14 @@ process_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 			if (pending_req->type == REQUEST_TYPE_SYNC)
 				pthread_cond_signal(&pending_req->sync.cond);
 			else if (pending_req->type == REQUEST_TYPE_ASYNC)
-				pthread_cond_signal(
-					&pending_requests.async_cond);
+				req = async_reply_handle_thread_unsafe(
+						pending_req);
 		} else
 			RTE_LOG(ERR, EAL, "Drop mp reply: %s\n", msg->name);
 		pthread_mutex_unlock(&pending_requests.lock);
+
+		if (req != NULL)
+			trigger_async_action(req);
 		return;
 	}
 
@@ -365,7 +379,6 @@ timespec_cmp(const struct timespec *a, const struct timespec *b)
 }
 
 enum async_action {
-	ACTION_NONE, /**< don't do anything */
 	ACTION_FREE, /**< free the action entry, but don't trigger callback */
 	ACTION_TRIGGER /**< trigger callback, then free action entry */
 };
@@ -375,7 +388,7 @@ process_async_request(struct pending_request *sr, const struct timespec *now)
 {
 	struct async_request_param *param;
 	struct rte_mp_reply *reply;
-	bool timeout, received, last_msg;
+	bool timeout, last_msg;
 
 	param = sr->async.param;
 	reply = &param->user_reply;
@@ -383,13 +396,6 @@ process_async_request(struct pending_request *sr, const struct timespec *now)
 	/* did we timeout? */
 	timeout = timespec_cmp(&param->end, now) <= 0;
 
-	/* did we receive a response? */
-	received = sr->reply_received != 0;
-
-	/* if we didn't time out, and we didn't receive a response, ignore */
-	if (!timeout && !received)
-		return ACTION_NONE;
-
 	/* if we received a response, adjust relevant data and copy mesasge. */
 	if (sr->reply_received == 1 && sr->reply) {
 		struct rte_mp_msg *msg, *user_msgs, *tmp;
@@ -448,120 +454,60 @@ trigger_async_action(struct pending_request *sr)
 	free(sr->async.param->user_reply.msgs);
 	free(sr->async.param);
 	free(sr->request);
+	free(sr);
 }
 
 static struct pending_request *
-check_trigger(struct timespec *ts)
+async_reply_handle_thread_unsafe(void *arg)
 {
-	struct pending_request *next, *cur, *trigger = NULL;
-
-	TAILQ_FOREACH_SAFE(cur, &pending_requests.requests, next, next) {
-		enum async_action action;
-		if (cur->type != REQUEST_TYPE_ASYNC)
-			continue;
-
-		action = process_async_request(cur, ts);
-		if (action == ACTION_FREE) {
-			TAILQ_REMOVE(&pending_requests.requests, cur, next);
-			free(cur);
-		} else if (action == ACTION_TRIGGER) {
-			TAILQ_REMOVE(&pending_requests.requests, cur, next);
-			trigger = cur;
-			break;
-		}
-	}
-	return trigger;
-}
-
-static void
-wait_for_async_messages(void)
-{
-	struct pending_request *sr;
-	struct timespec timeout;
-	bool timedwait = false;
-	bool nowait = false;
-	int ret;
-
-	/* scan through the list and see if there are any timeouts that
-	 * are earlier than our current timeout.
-	 */
-	TAILQ_FOREACH(sr, &pending_requests.requests, next) {
-		if (sr->type != REQUEST_TYPE_ASYNC)
-			continue;
-		if (!timedwait || timespec_cmp(&sr->async.param->end,
-				&timeout) < 0) {
-			memcpy(&timeout, &sr->async.param->end,
-				sizeof(timeout));
-			timedwait = true;
-		}
-
-		/* sometimes, we don't even wait */
-		if (sr->reply_received) {
-			nowait = true;
-			break;
-		}
-	}
-
-	if (nowait)
-		return;
-
-	do {
-		ret = timedwait ?
-			pthread_cond_timedwait(
-				&pending_requests.async_cond,
-				&pending_requests.lock,
-				&timeout) :
-			pthread_cond_wait(
-				&pending_requests.async_cond,
-				&pending_requests.lock);
-	} while (ret != 0 && ret != ETIMEDOUT);
-
-	/* we've been woken up or timed out */
-}
-
-static void *
-async_reply_handle(void *arg __rte_unused)
-{
-	struct timeval now;
+	struct pending_request *req = (struct pending_request *)arg;
+	enum async_action action;
 	struct timespec ts_now;
-	while (1) {
-		struct pending_request *trigger = NULL;
+	struct timeval now;
 
-		pthread_mutex_lock(&pending_requests.lock);
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto no_trigger;
+	}
+	ts_now.tv_nsec = now.tv_usec * 1000;
+	ts_now.tv_sec = now.tv_sec;
 
-		/* we exit this function holding the lock */
-		wait_for_async_messages();
+	action = process_async_request(req, &ts_now);
 
-		if (gettimeofday(&now, NULL) < 0) {
-			pthread_mutex_unlock(&pending_requests.lock);
-			RTE_LOG(ERR, EAL, "Cannot get current time\n");
-			break;
+	TAILQ_REMOVE(&pending_requests.requests, req, next);
+
+	if (rte_eal_alarm_cancel(async_reply_handle, req) < 0) {
+		/* if we failed to cancel the alarm because it's already in
+		 * progress, don't proceed because otherwise we will end up
+		 * handling the same message twice.
+		 */
+		if (rte_errno == EINPROGRESS) {
+			RTE_LOG(DEBUG, EAL, "Request handling is already in progress\n");
+			goto no_trigger;
 		}
-		ts_now.tv_nsec = now.tv_usec * 1000;
-		ts_now.tv_sec = now.tv_sec;
-
-		do {
-			trigger = check_trigger(&ts_now);
-			/* unlock request list */
-			pthread_mutex_unlock(&pending_requests.lock);
-
-			if (trigger) {
-				trigger_async_action(trigger);
-				free(trigger);
-
-				/* we've triggered a callback, but there may be
-				 * more, so lock the list and check again.
-				 */
-				pthread_mutex_lock(&pending_requests.lock);
-			}
-		} while (trigger);
+		RTE_LOG(ERR, EAL, "Failed to cancel alarm\n");
 	}
 
-	RTE_LOG(ERR, EAL, "ERROR: asynchronous requests disabled\n");
-
+	if (action == ACTION_TRIGGER)
+		return req;
+no_trigger:
+	free(req);
 	return NULL;
 }
 
+static void
+async_reply_handle(void *arg)
+{
+	struct pending_request *req;
+
+	pthread_mutex_lock(&pending_requests.lock);
+	req = async_reply_handle_thread_unsafe(arg);
+	pthread_mutex_unlock(&pending_requests.lock);
+
+	if (req != NULL)
+		trigger_async_action(req);
+}
+
 static int
 open_socket_fd(void)
 {
@@ -624,7 +570,7 @@ rte_mp_channel_init(void)
 {
 	char path[PATH_MAX];
 	int dir_fd;
-	pthread_t mp_handle_tid, async_reply_handle_tid;
+	pthread_t mp_handle_tid;
 
 	/* create filter path */
 	create_socket_path("*", path, sizeof(path));
@@ -671,17 +617,6 @@ rte_mp_channel_init(void)
 		return -1;
 	}
 
-	if (rte_ctrl_thread_create(&async_reply_handle_tid,
-			"rte_mp_async", NULL,
-			async_reply_handle, NULL) < 0) {
-		RTE_LOG(ERR, EAL, "failed to create mp thead: %s\n",
-			strerror(errno));
-		close(mp_fd);
-		close(dir_fd);
-		mp_fd = -1;
-		return -1;
-	}
-
 	/* unlock the directory */
 	flock(dir_fd, LOCK_UN);
 	close(dir_fd);
@@ -853,7 +788,7 @@ rte_mp_sendmsg(struct rte_mp_msg *msg)
 
 static int
 mp_request_async(const char *dst, struct rte_mp_msg *req,
-		struct async_request_param *param)
+		struct async_request_param *param, const struct timespec *ts)
 {
 	struct rte_mp_msg *reply_msg;
 	struct pending_request *pending_req, *exist;
@@ -898,6 +833,13 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 
 	param->user_reply.nb_sent++;
 
+	if (rte_eal_alarm_set(ts->tv_sec * 1000000 + ts->tv_nsec / 1000,
+			      async_reply_handle, pending_req) < 0) {
+		RTE_LOG(ERR, EAL, "Fail to set alarm for request %s:%s\n",
+			dst, req->name);
+		rte_panic("Fix the above shit to properly free all memory\n");
+	}
+
 	return 0;
 fail:
 	free(pending_req);
@@ -1119,7 +1061,7 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 
 	/* for secondary process, send request to the primary process only */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
-		ret = mp_request_async(eal_mp_socket_path(), copy, param);
+		ret = mp_request_async(eal_mp_socket_path(), copy, param, ts);
 
 		/* if we didn't send anything, put dummy request on the queue */
 		if (ret == 0 && reply->nb_sent == 0) {
@@ -1162,7 +1104,7 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 		snprintf(path, sizeof(path), "%s/%s", mp_dir_path,
 			 ent->d_name);
 
-		if (mp_request_async(path, copy, param))
+		if (mp_request_async(path, copy, param, ts))
 			ret = -1;
 	}
 	/* if we didn't send anything, put dummy request on the queue */
@@ -1171,9 +1113,6 @@ rte_mp_request_async(struct rte_mp_msg *req, const struct timespec *ts,
 		dummy_used = true;
 	}
 
-	/* trigger async request thread wake up */
-	pthread_cond_signal(&pending_requests.async_cond);
-
 	/* finally, unlock the queue */
 	pthread_mutex_unlock(&pending_requests.lock);
 
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dpdk-dev] [PATCH v2 7/7] doc: document IPC callback limitations
  2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
                   ` (16 preceding siblings ...)
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 6/7] ipc: remove IPC thread for async requests Anatoly Burakov
@ 2018-06-26 10:53 ` Anatoly Burakov
  17 siblings, 0 replies; 25+ messages in thread
From: Anatoly Burakov @ 2018-06-26 10:53 UTC (permalink / raw)
  To: dev
  Cc: John McNamara, Marko Kovacevic, konstantin.ananyev, thomas,
	bruce.richardson, qi.z.zhang

For asynchronous requests, user callback may be triggered either from
IPC thread or from interrupt thread. Because of this, delivery of
other interrupt-based events such as alarms may not be possible inside
the asynchronous IPC request callback handler. Document this
limitation.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 doc/guides/prog_guide/multi_proc_support.rst | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/doc/guides/prog_guide/multi_proc_support.rst b/doc/guides/prog_guide/multi_proc_support.rst
index 46a00ec11..1384fe335 100644
--- a/doc/guides/prog_guide/multi_proc_support.rst
+++ b/doc/guides/prog_guide/multi_proc_support.rst
@@ -220,8 +220,8 @@ way communication mechanism, with the requester expecting a response from the
 other side.
 
 Both messages and requests will trigger a named callback on the receiver side.
-These callbacks will be called from within a dedicated IPC thread that is not
-part of EAL lcore threads.
+These callbacks will be called from within a dedicated IPC or interrupt thread
+that are not part of EAL lcore threads.
 
 Registering for incoming messages
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -280,6 +280,13 @@ For asynchronous requests, a function pointer to the callback function must be
 provided instead. This callback will be called when the request either has timed
 out, or will have received a response to all the messages that were sent.
 
+.. warning::
+
+    When an asynchronous request times out, the callback will be called not by
+    a dedicated IPC thread, but rather from EAL interrupt thread. Because of
+    this, it may not be possible for DPDK to trigger another interrupt-based
+    event (such as an alarm) while handling asynchronous IPC callback.
+
 When the callback is called, the original request descriptor will be provided
 (so that it would be possible to determine for which sent message this is a
 callback to), along with a response descriptor like the one described above.
@@ -311,6 +318,12 @@ supported. However, since sending messages (not requests) does not involve an
 IPC thread, sending messages while processing another message or request is
 supported.
 
+Asynchronous request callbacks may be triggered either from IPC thread or from
+interrupt thread, depending on whether the request has timed out. It is
+therefore suggested to avoid waiting for interrupt-based events (such as alarms)
+inside asynchronous IPC request callbacks. This limitation does not apply to
+messages or synchronous requests.
+
 If callbacks spend a long time processing the incoming requests, the requestor
 might time out, so setting the right timeout value on the requestor side is
 imperative.
-- 
2.17.1

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling Anatoly Burakov
@ 2018-07-12 22:36   ` Thomas Monjalon
  2018-07-13  7:41     ` Burakov, Anatoly
  2018-07-13  8:09     ` David Marchand
  0 siblings, 2 replies; 25+ messages in thread
From: Thomas Monjalon @ 2018-07-12 22:36 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Jianfeng Tan, Bruce Richardson, konstantin.ananyev,
	qi.z.zhang, stephen, olivier.matz, david.marchand, ferruh.yigit

26/06/2018 12:53, Anatoly Burakov:
> From: Jianfeng Tan <jianfeng.tan@intel.com>
> 
> Next commit will make asynchronous IPC requests rely on alarm API,
> which in turn relies on interrupts to work. Therefore, move the EAL
> interrupt initialization before IPC initialization to avoid breaking
> IPC in the next commit.
> 
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -839,6 +839,11 @@ rte_eal_init(int argc, char **argv)
>  
>  	rte_config_init();
>  
> +	if (rte_eal_intr_init() < 0) {
> +		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
> +		return -1;
> +	}
> +
>  	/* Put mp channel init before bus scan so that we can init the vdev
>  	 * bus through mp channel in the secondary process before the bus scan.
>  	 */
> @@ -968,11 +973,6 @@ rte_eal_init(int argc, char **argv)
>  		rte_config.master_lcore, (int)thread_id, cpuset,
>  		ret == 0 ? "" : "...");
>  
> -	if (rte_eal_intr_init() < 0) {
> -		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
> -		return -1;
> -	}
> -
>  	RTE_LCORE_FOREACH_SLAVE(i) {

I am almost sure it will bring regressions.

Please think again about the consequences of initializing interrupt thread
before affinity setting, memory init, device init.

Opinions / ideas from anyone?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling
  2018-07-12 22:36   ` Thomas Monjalon
@ 2018-07-13  7:41     ` Burakov, Anatoly
  2018-07-13  8:09     ` David Marchand
  1 sibling, 0 replies; 25+ messages in thread
From: Burakov, Anatoly @ 2018-07-13  7:41 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Jianfeng Tan, Bruce Richardson, konstantin.ananyev,
	qi.z.zhang, stephen, olivier.matz, david.marchand, ferruh.yigit

On 12-Jul-18 11:36 PM, Thomas Monjalon wrote:
> 26/06/2018 12:53, Anatoly Burakov:
>> From: Jianfeng Tan <jianfeng.tan@intel.com>
>>
>> Next commit will make asynchronous IPC requests rely on alarm API,
>> which in turn relies on interrupts to work. Therefore, move the EAL
>> interrupt initialization before IPC initialization to avoid breaking
>> IPC in the next commit.
>>
>> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -839,6 +839,11 @@ rte_eal_init(int argc, char **argv)
>>   
>>   	rte_config_init();
>>   
>> +	if (rte_eal_intr_init() < 0) {
>> +		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
>> +		return -1;
>> +	}
>> +
>>   	/* Put mp channel init before bus scan so that we can init the vdev
>>   	 * bus through mp channel in the secondary process before the bus scan.
>>   	 */
>> @@ -968,11 +973,6 @@ rte_eal_init(int argc, char **argv)
>>   		rte_config.master_lcore, (int)thread_id, cpuset,
>>   		ret == 0 ? "" : "...");
>>   
>> -	if (rte_eal_intr_init() < 0) {
>> -		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
>> -		return -1;
>> -	}
>> -
>>   	RTE_LCORE_FOREACH_SLAVE(i) {
> 
> I am almost sure it will bring regressions.
> 
> Please think again about the consequences of initializing interrupt thread
> before affinity setting, memory init, device init.
> 
> Opinions / ideas from anyone?
> 

Since interrupt thread is no longer relying on rte_malloc, i do not 
expect there to be any consequences for memory.

I also do not expect any consequences due to moving intr init before 
setting CPU thread affinity, because first of all interrupt thread is 
*already* run before setting thread affinity of the lcores (but after 
setting thread affinity of the master), and it picks its own affinity 
based on parsed coremask anyway, which is already parsed at a point 
where i'm moving it to. Affinities will be set similarly to how they 
were set before, because lcore information is already parsed.

As for device init, that is debatable. The only consequence i can think 
of is if device interrupts happen right after enabling the intr handler 
for that device and this causes some kind of issue or a race. But, we 
already support device hotplug which essentially causes the same 
situation to happen (interrupt handler initialized before bus 
scan/probe), so i'm not really convinced this could have any negative 
consequences either.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling
  2018-07-12 22:36   ` Thomas Monjalon
  2018-07-13  7:41     ` Burakov, Anatoly
@ 2018-07-13  8:09     ` David Marchand
  2018-07-13  9:10       ` Tiwei Bie
  1 sibling, 1 reply; 25+ messages in thread
From: David Marchand @ 2018-07-13  8:09 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Anatoly Burakov, dev, Jianfeng Tan, Bruce Richardson, Ananyev,
	Konstantin, qi.z.zhang, Stephen Hemminger, Olivier Matz,
	Ferruh Yigit, Maxime Coquelin, tiwei.bie

On Fri, Jul 13, 2018 at 12:36 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> I am almost sure it will bring regressions.
>
> Please think again about the consequences of initializing interrupt thread
> before affinity setting, memory init, device init.

Well, there was an issue vith virtio at one time (interrupt handler
did not have the right iopl for virtio callback).
http://git.dpdk.org/dpdk/commit/lib/librte_eal/linuxapp/eal/eal.c?id=fd6949c55c9a48e81c625138679159829d51ac51

Now... time has passed since then.
It is worth checking the issue is still here, and if it is the case,
revisit this.
I suppose virtio pmd should deal with this itself.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling
  2018-07-13  8:09     ` David Marchand
@ 2018-07-13  9:10       ` Tiwei Bie
  2018-07-13 11:28         ` David Marchand
  0 siblings, 1 reply; 25+ messages in thread
From: Tiwei Bie @ 2018-07-13  9:10 UTC (permalink / raw)
  To: David Marchand
  Cc: Thomas Monjalon, Anatoly Burakov, dev, Jianfeng Tan,
	Bruce Richardson, Ananyev, Konstantin, qi.z.zhang,
	Stephen Hemminger, Olivier Matz, Ferruh Yigit, Maxime Coquelin

On Fri, Jul 13, 2018 at 10:09:42AM +0200, David Marchand wrote:
[...]
> Well, there was an issue vith virtio at one time (interrupt handler
> did not have the right iopl for virtio callback).
> http://git.dpdk.org/dpdk/commit/lib/librte_eal/linuxapp/eal/eal.c?id=fd6949c55c9a48e81c625138679159829d51ac51
> 
> Now... time has passed since then.
> It is worth checking the issue is still here, and if it is the case,
> revisit this.

Now rte_virtio_pmd_init() which calls rte_eal_iopl_init()
is called as a constructor [1], so the issue isn't there
any more.

[1] http://git.dpdk.org/dpdk/tree/drivers/net/virtio/virtio_ethdev.c?id=1d406458db47#n1789

Best regards,
Tiwei Bie

> I suppose virtio pmd should deal with this itself.
> 
> 
> -- 
> David Marchand

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/7] Remove asynchronous IPC thread
  2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 0/7] Remove asynchronous IPC thread Anatoly Burakov
@ 2018-07-13 10:44   ` Thomas Monjalon
  0 siblings, 0 replies; 25+ messages in thread
From: Thomas Monjalon @ 2018-07-13 10:44 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, konstantin.ananyev, bruce.richardson, qi.z.zhang

> Anatoly Burakov (4):
>   eal/bsdapp: add interrupt thread
>   eal/bsdapp: add alarm support
>   ipc: remove IPC thread for async requests
>   doc: document IPC callback limitations
> 
> Jianfeng Tan (3):
>   eal/linux: use glibc malloc in alarm
>   eal/linux: use glibc malloc in interrupt handling
>   eal: bring forward init of interrupt handling

Applied, thanks

Crossing fingers that moving interrupt init will have no consequence :)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling
  2018-07-13  9:10       ` Tiwei Bie
@ 2018-07-13 11:28         ` David Marchand
  0 siblings, 0 replies; 25+ messages in thread
From: David Marchand @ 2018-07-13 11:28 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Thomas Monjalon, Anatoly Burakov, dev, Jianfeng Tan,
	Bruce Richardson, Ananyev, Konstantin, qi.z.zhang,
	Stephen Hemminger, Olivier Matz, Ferruh Yigit, Maxime Coquelin

On Fri, Jul 13, 2018 at 11:10 AM, Tiwei Bie <tiwei.bie@intel.com> wrote:
> On Fri, Jul 13, 2018 at 10:09:42AM +0200, David Marchand wrote:
> [...]
>> Well, there was an issue vith virtio at one time (interrupt handler
>> did not have the right iopl for virtio callback).
>> http://git.dpdk.org/dpdk/commit/lib/librte_eal/linuxapp/eal/eal.c?id=fd6949c55c9a48e81c625138679159829d51ac51
>>
>> Now... time has passed since then.
>> It is worth checking the issue is still here, and if it is the case,
>> revisit this.
>
> Now rte_virtio_pmd_init() which calls rte_eal_iopl_init()
> is called as a constructor [1], so the issue isn't there
> any more.

If the virtio pmd is loaded dynamically, eal_plugins_init() is still
called before rte_eal_intr_init().
Yes, for this case, it looks to be fine with Anatoly change.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-07-13 11:29 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-15 14:25 [dpdk-dev] [PATCH 0/8] Remove IPC threads Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 1/8] eal/linux: use glibc malloc in alarm Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 2/8] eal/linux: use glibc malloc in interrupt handling Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 3/8] ipc: remove IPC thread for async requests Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 4/8] eal: bring forward init of interrupt handling Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 5/8] eal: add IPC type for interrupt thread Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 6/8] eal/bsdapp: add " Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 7/8] eal/bsdapp: add alarm support Anatoly Burakov
2018-06-15 14:25 ` [dpdk-dev] [PATCH 8/8] ipc: remove main IPC thread Anatoly Burakov
2018-06-26  1:19 ` [dpdk-dev] [PATCH 0/8] Remove IPC threads Zhang, Qi Z
2018-06-26  7:03 ` Zhang, Qi Z
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 0/7] Remove asynchronous IPC thread Anatoly Burakov
2018-07-13 10:44   ` Thomas Monjalon
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 1/7] eal/linux: use glibc malloc in alarm Anatoly Burakov
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 2/7] eal/linux: use glibc malloc in interrupt handling Anatoly Burakov
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 3/7] eal/bsdapp: add interrupt thread Anatoly Burakov
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 4/7] eal/bsdapp: add alarm support Anatoly Burakov
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 5/7] eal: bring forward init of interrupt handling Anatoly Burakov
2018-07-12 22:36   ` Thomas Monjalon
2018-07-13  7:41     ` Burakov, Anatoly
2018-07-13  8:09     ` David Marchand
2018-07-13  9:10       ` Tiwei Bie
2018-07-13 11:28         ` David Marchand
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 6/7] ipc: remove IPC thread for async requests Anatoly Burakov
2018-06-26 10:53 ` [dpdk-dev] [PATCH v2 7/7] doc: document IPC callback limitations Anatoly Burakov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).