From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 9D2F01B5AD for ; Thu, 13 Dec 2018 23:27:12 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 13 Dec 2018 14:27:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,350,1539673200"; d="scan'208";a="109363171" Received: from txasoft-yocto.an.intel.com (HELO txasoft-yocto.an.intel.com.) ([10.123.72.192]) by fmsmga008.fm.intel.com with ESMTP; 13 Dec 2018 14:27:10 -0800 From: Erik Gabriel Carrillo To: rsanford@akamai.com Cc: stephen@networkplumber.org, jerin.jacob@caviumnetworks.com, pbhagavatula@caviumnetworks.com, dev@dpdk.org Date: Thu, 13 Dec 2018 16:26:35 -0600 Message-Id: <1544739996-26011-2-git-send-email-erik.g.carrillo@intel.com> X-Mailer: git-send-email 1.7.10 In-Reply-To: <1544739996-26011-1-git-send-email-erik.g.carrillo@intel.com> References: <1544205180-31546-1-git-send-email-erik.g.carrillo@intel.com> <1544739996-26011-1-git-send-email-erik.g.carrillo@intel.com> Subject: [dpdk-dev] [PATCH v3 1/2] timer: allow timer management in shared memory X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Dec 2018 22:27:13 -0000 Currently, the timer library uses a per-process table of structures to manage skiplists of timers presumably because timers contain arbitrary function pointers whose value may not resolve properly in other processes. However, if the same callback is used handle all timers, and that callback is only invoked in one process, then it woud be safe to allow the data structures to be allocated in shared memory, and to allow secondary processes to modify the timer lists. This would let timers be used in more multi-process scenarios. The library's global variables are wrapped with a struct, and an array of these structures is created in shared memory. The original APIs are updated to reference the zeroth entry in the array. This maintains the original behavior for both primary and secondary processes since the set intersection of their coremasks should be empty [1]. New APIs are introduced to enable the allocation/deallocation of other entries in the array. New variants of the APIs used to start and stop timers are introduced; they allow a caller to specify which array entry should be used to locate the timer list to insert into or delete from. Finally, a new variant of rte_timer_manage() is introduced, which allows a caller to specify which array entry should be used to locate the timer lists to process; it can also process multiple timer lists per invocation. [1] https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations Signed-off-by: Erik Gabriel Carrillo --- lib/librte_timer/Makefile | 1 + lib/librte_timer/rte_timer.c | 519 ++++++++++++++++++++++++++++++--- lib/librte_timer/rte_timer.h | 226 +++++++++++++- lib/librte_timer/rte_timer_version.map | 22 ++ 4 files changed, 723 insertions(+), 45 deletions(-) diff --git a/lib/librte_timer/Makefile b/lib/librte_timer/Makefile index 4ebd528..8ec63f4 100644 --- a/lib/librte_timer/Makefile +++ b/lib/librte_timer/Makefile @@ -6,6 +6,7 @@ include $(RTE_SDK)/mk/rte.vars.mk # library name LIB = librte_timer.a +CFLAGS += -DALLOW_EXPERIMENTAL_API CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 LDLIBS += -lrte_eal diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c index 30c7b0a..d761cda 100644 --- a/lib/librte_timer/rte_timer.c +++ b/lib/librte_timer/rte_timer.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include @@ -21,11 +22,15 @@ #include #include #include +#include +#include +#include #include "rte_timer.h" -LIST_HEAD(rte_timer_list, rte_timer); - +/** + * Per-lcore info for timers. + */ struct priv_timer { struct rte_timer pending_head; /**< dummy timer instance to head up list */ rte_spinlock_t list_lock; /**< lock to protect list access */ @@ -48,25 +53,84 @@ struct priv_timer { #endif } __rte_cache_aligned; -/** per-lcore private info for timers */ -static struct priv_timer priv_timer[RTE_MAX_LCORE]; +#define FL_ALLOCATED (1 << 0) +struct rte_timer_data { + struct priv_timer priv_timer[RTE_MAX_LCORE]; + uint8_t internal_flags; +}; + +#define RTE_MAX_DATA_ELS 64 +static struct rte_timer_data *rte_timer_data_arr; +static uint32_t default_data_id; +static uint32_t rte_timer_subsystem_initialized; + +/* For maintaining older interfaces for a period */ +static struct rte_timer_data default_timer_data; /* when debug is enabled, store some statistics */ #ifdef RTE_LIBRTE_TIMER_DEBUG -#define __TIMER_STAT_ADD(name, n) do { \ +#define __TIMER_STAT_ADD(priv_timer, name, n) do { \ unsigned __lcore_id = rte_lcore_id(); \ if (__lcore_id < RTE_MAX_LCORE) \ priv_timer[__lcore_id].stats.name += (n); \ } while(0) #else -#define __TIMER_STAT_ADD(name, n) do {} while(0) +#define __TIMER_STAT_ADD(priv_timer, name, n) do {} while (0) #endif -/* Init the timer library. */ +static inline int +timer_data_valid(uint32_t id) +{ + return !!(rte_timer_data_arr[id].internal_flags & FL_ALLOCATED); +} + +/* validate ID and retrieve timer data pointer, or return error value */ +#define TIMER_DATA_VALID_GET_OR_ERR_RET(id, timer_data, retval) do { \ + if (id >= RTE_MAX_DATA_ELS || !timer_data_valid(id)) \ + return retval; \ + timer_data = &rte_timer_data_arr[id]; \ +} while (0) + +int __rte_experimental +rte_timer_data_alloc(uint32_t *id_ptr) +{ + int i; + struct rte_timer_data *data; + + if (!rte_timer_subsystem_initialized) + return -ENOMEM; + + for (i = 0; i < RTE_MAX_DATA_ELS; i++) { + data = &rte_timer_data_arr[i]; + if (!(data->internal_flags & FL_ALLOCATED)) { + data->internal_flags |= FL_ALLOCATED; + + if (id_ptr) + *id_ptr = i; + + return 0; + } + } + + return -ENOSPC; +} + +int __rte_experimental +rte_timer_data_dealloc(uint32_t id) +{ + struct rte_timer_data *timer_data; + TIMER_DATA_VALID_GET_OR_ERR_RET(id, timer_data, -EINVAL); + + timer_data->internal_flags &= ~(FL_ALLOCATED); + + return 0; +} + void -rte_timer_subsystem_init(void) +rte_timer_subsystem_init_v20(void) { unsigned lcore_id; + struct priv_timer *priv_timer = default_timer_data.priv_timer; /* since priv_timer is static, it's zeroed by default, so only init some * fields. @@ -76,6 +140,76 @@ rte_timer_subsystem_init(void) priv_timer[lcore_id].prev_lcore = lcore_id; } } +VERSION_SYMBOL(rte_timer_subsystem_init, _v20, 2.0); + +/* Init the timer library. Allocate an array of timer data structs in shared + * memory, and allocate the zeroth entry for use with original timer + * APIs. Since the intersection of the sets of lcore ids in primary and + * secondary processes should be empty, the zeroth entry can be shared by + * multiple processes. + */ +int +rte_timer_subsystem_init_v1902(void) +{ + const struct rte_memzone *mz; + struct rte_timer_data *data; + int i, lcore_id; + static const char *mz_name = "rte_timer_mz"; + + if (rte_timer_subsystem_initialized) + return -EALREADY; + + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { + mz = rte_memzone_lookup(mz_name); + if (mz == NULL) + return -EEXIST; + + rte_timer_data_arr = mz->addr; + + rte_timer_data_arr[default_data_id].internal_flags |= + FL_ALLOCATED; + + rte_timer_subsystem_initialized = 1; + + return 0; + } + + mz = rte_memzone_reserve_aligned(mz_name, + RTE_MAX_DATA_ELS * sizeof(*rte_timer_data_arr), + SOCKET_ID_ANY, 0, RTE_CACHE_LINE_SIZE); + if (mz == NULL) + return -ENOMEM; + + rte_timer_data_arr = mz->addr; + + for (i = 0; i < RTE_MAX_DATA_ELS; i++) { + data = &rte_timer_data_arr[i]; + + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + rte_spinlock_init( + &data->priv_timer[lcore_id].list_lock); + data->priv_timer[lcore_id].prev_lcore = lcore_id; + } + } + + rte_timer_data_arr[default_data_id].internal_flags |= FL_ALLOCATED; + + rte_timer_subsystem_initialized = 1; + + return 0; +} +MAP_STATIC_SYMBOL(int rte_timer_subsystem_init(void), + rte_timer_subsystem_init_v1902); +BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1902, 19.02); + +void __rte_experimental +rte_timer_subsystem_finalize(void) +{ + if (rte_timer_data_arr) + rte_free(rte_timer_data_arr); + + rte_timer_subsystem_initialized = 0; +} /* Initialize the timer handle tim for use */ void @@ -95,7 +229,8 @@ rte_timer_init(struct rte_timer *tim) */ static int timer_set_config_state(struct rte_timer *tim, - union rte_timer_status *ret_prev_status) + union rte_timer_status *ret_prev_status, + struct priv_timer *priv_timer) { union rte_timer_status prev_status, status; int success = 0; @@ -207,7 +342,7 @@ timer_get_skiplist_level(unsigned curr_depth) */ static void timer_get_prev_entries(uint64_t time_val, unsigned tim_lcore, - struct rte_timer **prev) + struct rte_timer **prev, struct priv_timer *priv_timer) { unsigned lvl = priv_timer[tim_lcore].curr_skiplist_depth; prev[lvl] = &priv_timer[tim_lcore].pending_head; @@ -226,13 +361,15 @@ timer_get_prev_entries(uint64_t time_val, unsigned tim_lcore, */ static void timer_get_prev_entries_for_node(struct rte_timer *tim, unsigned tim_lcore, - struct rte_timer **prev) + struct rte_timer **prev, + struct priv_timer *priv_timer) { int i; + /* to get a specific entry in the list, look for just lower than the time * values, and then increment on each level individually if necessary */ - timer_get_prev_entries(tim->expire - 1, tim_lcore, prev); + timer_get_prev_entries(tim->expire - 1, tim_lcore, prev, priv_timer); for (i = priv_timer[tim_lcore].curr_skiplist_depth - 1; i >= 0; i--) { while (prev[i]->sl_next[i] != NULL && prev[i]->sl_next[i] != tim && @@ -247,14 +384,15 @@ timer_get_prev_entries_for_node(struct rte_timer *tim, unsigned tim_lcore, * timer must not be in a list */ static void -timer_add(struct rte_timer *tim, unsigned int tim_lcore) +timer_add(struct rte_timer *tim, unsigned int tim_lcore, + struct priv_timer *priv_timer) { unsigned lvl; struct rte_timer *prev[MAX_SKIPLIST_DEPTH+1]; /* find where exactly this element goes in the list of elements * for each depth. */ - timer_get_prev_entries(tim->expire, tim_lcore, prev); + timer_get_prev_entries(tim->expire, tim_lcore, prev, priv_timer); /* now assign it a new level and add at that level */ const unsigned tim_level = timer_get_skiplist_level( @@ -284,7 +422,7 @@ timer_add(struct rte_timer *tim, unsigned int tim_lcore) */ static void timer_del(struct rte_timer *tim, union rte_timer_status prev_status, - int local_is_locked) + int local_is_locked, struct priv_timer *priv_timer) { unsigned lcore_id = rte_lcore_id(); unsigned prev_owner = prev_status.owner; @@ -304,7 +442,7 @@ timer_del(struct rte_timer *tim, union rte_timer_status prev_status, ((tim->sl_next[0] == NULL) ? 0 : tim->sl_next[0]->expire); /* adjust pointers from previous entries to point past this */ - timer_get_prev_entries_for_node(tim, prev_owner, prev); + timer_get_prev_entries_for_node(tim, prev_owner, prev, priv_timer); for (i = priv_timer[prev_owner].curr_skiplist_depth - 1; i >= 0; i--) { if (prev[i]->sl_next[i] == tim) prev[i]->sl_next[i] = tim->sl_next[i]; @@ -326,11 +464,13 @@ static int __rte_timer_reset(struct rte_timer *tim, uint64_t expire, uint64_t period, unsigned tim_lcore, rte_timer_cb_t fct, void *arg, - int local_is_locked) + int local_is_locked, + struct rte_timer_data *timer_data) { union rte_timer_status prev_status, status; int ret; unsigned lcore_id = rte_lcore_id(); + struct priv_timer *priv_timer = timer_data->priv_timer; /* round robin for tim_lcore */ if (tim_lcore == (unsigned)LCORE_ID_ANY) { @@ -348,11 +488,11 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire, /* wait that the timer is in correct status before update, * and mark it as being configured */ - ret = timer_set_config_state(tim, &prev_status); + ret = timer_set_config_state(tim, &prev_status, priv_timer); if (ret < 0) return -1; - __TIMER_STAT_ADD(reset, 1); + __TIMER_STAT_ADD(priv_timer, reset, 1); if (prev_status.state == RTE_TIMER_RUNNING && lcore_id < RTE_MAX_LCORE) { priv_timer[lcore_id].updated = 1; @@ -360,8 +500,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire, /* remove it from list */ if (prev_status.state == RTE_TIMER_PENDING) { - timer_del(tim, prev_status, local_is_locked); - __TIMER_STAT_ADD(pending, -1); + timer_del(tim, prev_status, local_is_locked, priv_timer); + __TIMER_STAT_ADD(priv_timer, pending, -1); } tim->period = period; @@ -376,8 +516,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire, if (tim_lcore != lcore_id || !local_is_locked) rte_spinlock_lock(&priv_timer[tim_lcore].list_lock); - __TIMER_STAT_ADD(pending, 1); - timer_add(tim, tim_lcore); + __TIMER_STAT_ADD(priv_timer, pending, 1); + timer_add(tim, tim_lcore, priv_timer); /* update state: as we are in CONFIG state, only us can modify * the state so we don't need to use cmpset() here */ @@ -394,9 +534,9 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire, /* Reset and start the timer associated with the timer handle tim */ int -rte_timer_reset(struct rte_timer *tim, uint64_t ticks, - enum rte_timer_type type, unsigned tim_lcore, - rte_timer_cb_t fct, void *arg) +rte_timer_reset_v20(struct rte_timer *tim, uint64_t ticks, + enum rte_timer_type type, unsigned int tim_lcore, + rte_timer_cb_t fct, void *arg) { uint64_t cur_time = rte_get_timer_cycles(); uint64_t period; @@ -412,7 +552,48 @@ rte_timer_reset(struct rte_timer *tim, uint64_t ticks, period = 0; return __rte_timer_reset(tim, cur_time + ticks, period, tim_lcore, - fct, arg, 0); + fct, arg, 0, &default_timer_data); +} +VERSION_SYMBOL(rte_timer_reset, _v20, 2.0); + +int +rte_timer_reset_v1902(struct rte_timer *tim, uint64_t ticks, + enum rte_timer_type type, unsigned int tim_lcore, + rte_timer_cb_t fct, void *arg) +{ + return rte_timer_alt_reset(default_data_id, tim, ticks, type, + tim_lcore, fct, arg); +} +MAP_STATIC_SYMBOL(int rte_timer_reset(struct rte_timer *tim, uint64_t ticks, + enum rte_timer_type type, + unsigned int tim_lcore, + rte_timer_cb_t fct, void *arg), + rte_timer_reset_v1902); +BIND_DEFAULT_SYMBOL(rte_timer_reset, _v1902, 19.02); + +int __rte_experimental +rte_timer_alt_reset(uint32_t timer_data_id, struct rte_timer *tim, + uint64_t ticks, enum rte_timer_type type, + unsigned int tim_lcore, rte_timer_cb_t fct, void *arg) +{ + uint64_t cur_time = rte_get_timer_cycles(); + uint64_t period; + struct rte_timer_data *timer_data; + + TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, timer_data, -EINVAL); + + if (unlikely((tim_lcore != (unsigned int)LCORE_ID_ANY) && + !(rte_lcore_is_enabled(tim_lcore) || + rte_lcore_has_role(tim_lcore, ROLE_SERVICE)))) + return -1; + + if (type == PERIODICAL) + period = ticks; + else + period = 0; + + return __rte_timer_reset(tim, cur_time + ticks, period, tim_lcore, + fct, arg, 0, timer_data); } /* loop until rte_timer_reset() succeed */ @@ -426,21 +607,22 @@ rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks, rte_pause(); } -/* Stop the timer associated with the timer handle tim */ -int -rte_timer_stop(struct rte_timer *tim) +static int +__rte_timer_stop(struct rte_timer *tim, int local_is_locked, + struct rte_timer_data *timer_data) { union rte_timer_status prev_status, status; unsigned lcore_id = rte_lcore_id(); int ret; + struct priv_timer *priv_timer = timer_data->priv_timer; /* wait that the timer is in correct status before update, * and mark it as being configured */ - ret = timer_set_config_state(tim, &prev_status); + ret = timer_set_config_state(tim, &prev_status, priv_timer); if (ret < 0) return -1; - __TIMER_STAT_ADD(stop, 1); + __TIMER_STAT_ADD(priv_timer, stop, 1); if (prev_status.state == RTE_TIMER_RUNNING && lcore_id < RTE_MAX_LCORE) { priv_timer[lcore_id].updated = 1; @@ -448,8 +630,8 @@ rte_timer_stop(struct rte_timer *tim) /* remove it from list */ if (prev_status.state == RTE_TIMER_PENDING) { - timer_del(tim, prev_status, 0); - __TIMER_STAT_ADD(pending, -1); + timer_del(tim, prev_status, local_is_locked, priv_timer); + __TIMER_STAT_ADD(priv_timer, pending, -1); } /* mark timer as stopped */ @@ -461,6 +643,33 @@ rte_timer_stop(struct rte_timer *tim) return 0; } +/* Stop the timer associated with the timer handle tim */ +int +rte_timer_stop_v20(struct rte_timer *tim) +{ + return __rte_timer_stop(tim, 0, &default_timer_data); +} +VERSION_SYMBOL(rte_timer_stop, _v20, 2.0); + +int +rte_timer_stop_v1902(struct rte_timer *tim) +{ + return rte_timer_alt_stop(default_data_id, tim); +} +MAP_STATIC_SYMBOL(int rte_timer_stop(struct rte_timer *tim), + rte_timer_stop_v1902); +BIND_DEFAULT_SYMBOL(rte_timer_stop, _v1902, 19.02); + +int __rte_experimental +rte_timer_alt_stop(uint32_t timer_data_id, struct rte_timer *tim) +{ + struct rte_timer_data *timer_data; + + TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, timer_data, -EINVAL); + + return __rte_timer_stop(tim, 0, timer_data); +} + /* loop until rte_timer_stop() succeed */ void rte_timer_stop_sync(struct rte_timer *tim) @@ -477,7 +686,8 @@ rte_timer_pending(struct rte_timer *tim) } /* must be called periodically, run all timer that expired */ -void rte_timer_manage(void) +static void +__rte_timer_manage(struct rte_timer_data *timer_data) { union rte_timer_status status; struct rte_timer *tim, *next_tim; @@ -486,11 +696,12 @@ void rte_timer_manage(void) struct rte_timer *prev[MAX_SKIPLIST_DEPTH + 1]; uint64_t cur_time; int i, ret; + struct priv_timer *priv_timer = timer_data->priv_timer; /* timer manager only runs on EAL thread with valid lcore_id */ assert(lcore_id < RTE_MAX_LCORE); - __TIMER_STAT_ADD(manage, 1); + __TIMER_STAT_ADD(priv_timer, manage, 1); /* optimize for the case where per-cpu list is empty */ if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL) return; @@ -518,7 +729,7 @@ void rte_timer_manage(void) tim = priv_timer[lcore_id].pending_head.sl_next[0]; /* break the existing list at current time point */ - timer_get_prev_entries(cur_time, lcore_id, prev); + timer_get_prev_entries(cur_time, lcore_id, prev, priv_timer); for (i = priv_timer[lcore_id].curr_skiplist_depth -1; i >= 0; i--) { if (prev[i] == &priv_timer[lcore_id].pending_head) continue; @@ -563,7 +774,7 @@ void rte_timer_manage(void) /* execute callback function with list unlocked */ tim->f(tim, tim->arg); - __TIMER_STAT_ADD(pending, -1); + __TIMER_STAT_ADD(priv_timer, pending, -1); /* the timer was stopped or reloaded by the callback * function, we have nothing to do here */ if (priv_timer[lcore_id].updated == 1) @@ -580,24 +791,222 @@ void rte_timer_manage(void) /* keep it in list and mark timer as pending */ rte_spinlock_lock(&priv_timer[lcore_id].list_lock); status.state = RTE_TIMER_PENDING; - __TIMER_STAT_ADD(pending, 1); + __TIMER_STAT_ADD(priv_timer, pending, 1); status.owner = (int16_t)lcore_id; rte_wmb(); tim->status.u32 = status.u32; __rte_timer_reset(tim, tim->expire + tim->period, - tim->period, lcore_id, tim->f, tim->arg, 1); + tim->period, lcore_id, tim->f, tim->arg, 1, + timer_data); rte_spinlock_unlock(&priv_timer[lcore_id].list_lock); } } priv_timer[lcore_id].running_tim = NULL; } +void +rte_timer_manage_v20(void) +{ + __rte_timer_manage(&default_timer_data); +} +VERSION_SYMBOL(rte_timer_manage, _v20, 2.0); + +int +rte_timer_manage_v1902(void) +{ + struct rte_timer_data *timer_data; + + TIMER_DATA_VALID_GET_OR_ERR_RET(default_data_id, timer_data, -EINVAL); + + __rte_timer_manage(timer_data); + + return 0; +} +MAP_STATIC_SYMBOL(int rte_timer_manage(void), rte_timer_manage_v1902); +BIND_DEFAULT_SYMBOL(rte_timer_manage, _v1902, 19.02); + +int __rte_experimental +rte_timer_alt_manage(uint32_t timer_data_id, + unsigned int *poll_lcores, + int nb_poll_lcores, + rte_timer_alt_manage_cb_t f) +{ + union rte_timer_status status; + struct rte_timer *tim, *next_tim, **pprev; + struct rte_timer *run_first_tims[RTE_MAX_LCORE]; + unsigned int runlist_lcore_ids[RTE_MAX_LCORE]; + unsigned int this_lcore = rte_lcore_id(); + struct rte_timer *prev[MAX_SKIPLIST_DEPTH + 1]; + uint64_t cur_time; + int i, j, ret; + int nb_runlists = 0; + struct rte_timer_data *data; + struct priv_timer *privp; + uint32_t poll_lcore; + + TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, data, -EINVAL); + + /* timer manager only runs on EAL thread with valid lcore_id */ + assert(this_lcore < RTE_MAX_LCORE); + + __TIMER_STAT_ADD(data->priv_timer, manage, 1); + + if (poll_lcores == NULL) { + poll_lcores = (unsigned int []){rte_lcore_id()}; + nb_poll_lcores = 1; + } + + for (i = 0, poll_lcore = poll_lcores[i]; i < nb_poll_lcores; + poll_lcore = poll_lcores[++i]) { + privp = &data->priv_timer[poll_lcore]; + + /* optimize for the case where per-cpu list is empty */ + if (privp->pending_head.sl_next[0] == NULL) + continue; + cur_time = rte_get_timer_cycles(); + +#ifdef RTE_ARCH_64 + /* on 64-bit the value cached in the pending_head.expired will + * be updated atomically, so we can consult that for a quick + * check here outside the lock + */ + if (likely(privp->pending_head.expire > cur_time)) + continue; +#endif + + /* browse ordered list, add expired timers in 'expired' list */ + rte_spinlock_lock(&privp->list_lock); + + /* if nothing to do just unlock and return */ + if (privp->pending_head.sl_next[0] == NULL || + privp->pending_head.sl_next[0]->expire > cur_time) { + rte_spinlock_unlock(&privp->list_lock); + continue; + } + + /* save start of list of expired timers */ + tim = privp->pending_head.sl_next[0]; + + /* break the existing list at current time point */ + timer_get_prev_entries(cur_time, poll_lcore, prev, + data->priv_timer); + for (j = privp->curr_skiplist_depth - 1; j >= 0; j--) { + if (prev[j] == &privp->pending_head) + continue; + privp->pending_head.sl_next[j] = + prev[j]->sl_next[j]; + if (prev[j]->sl_next[j] == NULL) + privp->curr_skiplist_depth--; + + prev[j]->sl_next[j] = NULL; + } + + /* transition run-list from PENDING to RUNNING */ + run_first_tims[nb_runlists] = tim; + runlist_lcore_ids[nb_runlists] = poll_lcore; + pprev = &run_first_tims[nb_runlists]; + nb_runlists++; + + for ( ; tim != NULL; tim = next_tim) { + next_tim = tim->sl_next[0]; + + ret = timer_set_running_state(tim); + if (likely(ret == 0)) { + pprev = &tim->sl_next[0]; + } else { + /* another core is trying to re-config this one, + * remove it from local expired list + */ + *pprev = next_tim; + } + } + + /* update the next to expire timer value */ + privp->pending_head.expire = + (privp->pending_head.sl_next[0] == NULL) ? 0 : + privp->pending_head.sl_next[0]->expire; + + rte_spinlock_unlock(&privp->list_lock); + } + + /* Now process the run lists */ + while (1) { + bool done = true; + uint64_t min_expire = UINT64_MAX; + int min_idx = 0; + + /* Find the next oldest timer to process */ + for (i = 0; i < nb_runlists; i++) { + tim = run_first_tims[i]; + + if (tim != NULL && tim->expire < min_expire) { + min_expire = tim->expire; + min_idx = i; + done = false; + } + } + + if (done) + break; + + tim = run_first_tims[min_idx]; + privp = &data->priv_timer[runlist_lcore_ids[min_idx]]; + + /* Move down the runlist from which we picked a timer to + * execute + */ + run_first_tims[min_idx] = run_first_tims[min_idx]->sl_next[0]; + + privp->updated = 0; + privp->running_tim = tim; + + /* Call the provided callback function */ + f(tim); + + __TIMER_STAT_ADD(privp, pending, -1); + + /* the timer was stopped or reloaded by the callback + * function, we have nothing to do here + */ + if (privp->updated == 1) + continue; + + if (tim->period == 0) { + /* remove from done list and mark timer as stopped */ + status.state = RTE_TIMER_STOP; + status.owner = RTE_TIMER_NO_OWNER; + rte_wmb(); + tim->status.u32 = status.u32; + } else { + /* keep it in list and mark timer as pending */ + rte_spinlock_lock( + &data->priv_timer[this_lcore].list_lock); + status.state = RTE_TIMER_PENDING; + __TIMER_STAT_ADD(data->priv_timer, pending, 1); + status.owner = (int16_t)this_lcore; + rte_wmb(); + tim->status.u32 = status.u32; + __rte_timer_reset(tim, tim->expire + tim->period, + tim->period, this_lcore, tim->f, tim->arg, 1, + data); + rte_spinlock_unlock( + &data->priv_timer[this_lcore].list_lock); + } + + privp->running_tim = NULL; + } + + return 0; +} + /* dump statistics about timers */ -void rte_timer_dump_stats(FILE *f) +static void +__rte_timer_dump_stats(struct rte_timer_data *timer_data __rte_unused, FILE *f) { #ifdef RTE_LIBRTE_TIMER_DEBUG struct rte_timer_debug_stats sum; unsigned lcore_id; + struct priv_timer *priv_timer = timer_data->priv_timer; memset(&sum, 0, sizeof(sum)); for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { @@ -615,3 +1024,31 @@ void rte_timer_dump_stats(FILE *f) fprintf(f, "No timer statistics, RTE_LIBRTE_TIMER_DEBUG is disabled\n"); #endif } + +void +rte_timer_dump_stats_v20(FILE *f) +{ + __rte_timer_dump_stats(&default_timer_data, f); +} +VERSION_SYMBOL(rte_timer_dump_stats, _v20, 2.0); + +int +rte_timer_dump_stats_v1902(FILE *f) +{ + return rte_timer_alt_dump_stats(default_data_id, f); +} +MAP_STATIC_SYMBOL(int rte_timer_dump_stats(FILE *f), + rte_timer_dump_stats_v1902); +BIND_DEFAULT_SYMBOL(rte_timer_dump_stats, _v1902, 19.02); + +int __rte_experimental +rte_timer_alt_dump_stats(uint32_t timer_data_id __rte_unused, FILE *f) +{ + struct rte_timer_data *timer_data; + + TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, timer_data, -EINVAL); + + __rte_timer_dump_stats(timer_data, f); + + return 0; +} diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h index 9b95cd2..82f5fba 100644 --- a/lib/librte_timer/rte_timer.h +++ b/lib/librte_timer/rte_timer.h @@ -39,6 +39,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -132,12 +133,68 @@ struct rte_timer #endif /** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Allocate a timer data instance in shared memory to track a set of pending + * timer lists. + * + * @param id_ptr + * Pointer to variable into which to write the identifier of the allocated + * timer data instance. + * + * @return + * - 0: Success + * - -ENOSPC: maximum number of timer data instances already allocated + */ +int __rte_experimental rte_timer_data_alloc(uint32_t *id_ptr); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Deallocate a timer data instance. + * + * @param id + * Identifier of the timer data instance to deallocate. + * + * @return + * - 0: Success + * - -EINVAL: invalid timer data instance identifier + */ +int __rte_experimental rte_timer_data_dealloc(uint32_t id); + +/** * Initialize the timer library. * * Initializes internal variables (list, locks and so on) for the RTE * timer library. */ -void rte_timer_subsystem_init(void); +void rte_timer_subsystem_init_v20(void); + +/** + * Initialize the timer library. + * + * Initializes internal variables (list, locks and so on) for the RTE + * timer library. + * + * @return + * - 0: Success + * - -EEXIST: Returned in secondary process when primary process has not + * yet initialized the timer subsystem + * - -ENOMEM: Unable to allocate memory needed to initialize timer + * subsystem + */ +int rte_timer_subsystem_init_v1902(void); +int rte_timer_subsystem_init(void); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Free timer subsystem resources. + */ +void __rte_experimental rte_timer_subsystem_finalize(void); /** * Initialize a timer handle. @@ -193,6 +250,12 @@ void rte_timer_init(struct rte_timer *tim); * - 0: Success; the timer is scheduled. * - (-1): Timer is in the RUNNING or CONFIG state. */ +int rte_timer_reset_v20(struct rte_timer *tim, uint64_t ticks, + enum rte_timer_type type, unsigned int tim_lcore, + rte_timer_cb_t fct, void *arg); +int rte_timer_reset_v1902(struct rte_timer *tim, uint64_t ticks, + enum rte_timer_type type, unsigned int tim_lcore, + rte_timer_cb_t fct, void *arg); int rte_timer_reset(struct rte_timer *tim, uint64_t ticks, enum rte_timer_type type, unsigned tim_lcore, rte_timer_cb_t fct, void *arg); @@ -252,9 +315,10 @@ rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks, * - 0: Success; the timer is stopped. * - (-1): The timer is in the RUNNING or CONFIG state. */ +int rte_timer_stop_v20(struct rte_timer *tim); +int rte_timer_stop_v1902(struct rte_timer *tim); int rte_timer_stop(struct rte_timer *tim); - /** * Loop until rte_timer_stop() succeeds. * @@ -292,7 +356,25 @@ int rte_timer_pending(struct rte_timer *tim); * function. However, the more often the function is called, the more * CPU resources it will use. */ -void rte_timer_manage(void); +void rte_timer_manage_v20(void); + +/** + * Manage the timer list and execute callback functions. + * + * This function must be called periodically from EAL lcores + * main_loop(). It browses the list of pending timers and runs all + * timers that are expired. + * + * The precision of the timer depends on the call frequency of this + * function. However, the more often the function is called, the more + * CPU resources it will use. + * + * @return + * - 0: Success + * - -EINVAL: timer subsystem not yet initialized + */ +int rte_timer_manage_v1902(void); +int rte_timer_manage(void); /** * Dump statistics about timers. @@ -300,7 +382,143 @@ void rte_timer_manage(void); * @param f * A pointer to a file for output */ -void rte_timer_dump_stats(FILE *f); +void rte_timer_dump_stats_v20(FILE *f); + +/** + * Dump statistics about timers. + * + * @param f + * A pointer to a file for output + * @return + * - 0: Success + * - -EINVAL: timer subsystem not yet initialized + */ +int rte_timer_dump_stats_v1902(FILE *f); +int rte_timer_dump_stats(FILE *f); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * This function is the same as rte_timer_reset(), except that it allows a + * caller to specify the rte_timer_data instance containing the list to which + * the timer should be added. + * + * @see rte_timer_reset() + * + * @param timer_data_id + * An identifier indicating which instance of timer data should be used for + * this operation. + * @param tim + * The timer handle. + * @param ticks + * The number of cycles (see rte_get_hpet_hz()) before the callback + * function is called. + * @param type + * The type can be either: + * - PERIODICAL: The timer is automatically reloaded after execution + * (returns to the PENDING state) + * - SINGLE: The timer is one-shot, that is, the timer goes to a + * STOPPED state after execution. + * @param tim_lcore + * The ID of the lcore where the timer callback function has to be + * executed. If tim_lcore is LCORE_ID_ANY, the timer library will + * launch it on a different core for each call (round-robin). + * @param fct + * The callback function of the timer. This parameter can be NULL if (and + * only if) rte_timer_alt_manage() will be used to manage this timer. + * @param arg + * The user argument of the callback function. + * @return + * - 0: Success; the timer is scheduled. + * - (-1): Timer is in the RUNNING or CONFIG state. + * - -EINVAL: invalid timer_data_id + */ +int __rte_experimental +rte_timer_alt_reset(uint32_t timer_data_id, struct rte_timer *tim, + uint64_t ticks, enum rte_timer_type type, + unsigned int tim_lcore, rte_timer_cb_t fct, void *arg); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * This function is the same as rte_timer_stop(), except that it allows a + * caller to specify the rte_timer_data instance containing the list from which + * this timer should be removed. + * + * @see rte_timer_stop() + * + * @param timer_data_id + * An identifier indicating which instance of timer data should be used for + * this operation. + * @param tim + * The timer handle. + * @return + * - 0: Success; the timer is stopped. + * - (-1): The timer is in the RUNNING or CONFIG state. + * - -EINVAL: invalid timer_data_id + */ +int __rte_experimental +rte_timer_alt_stop(uint32_t timer_data_id, struct rte_timer *tim); + +/** + * Callback function type for rte_timer_alt_manage(). + */ +typedef void (*rte_timer_alt_manage_cb_t)(void *); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * Manage a set of timer lists and execute the specified callback function for + * all expired timers. This function is similar to rte_timer_manage(), except + * that it allows a caller to specify the timer_data instance that should + * be operated on, as well as a set of lcore IDs identifying which timer lists + * should be processed. Callback functions of individual timers are ignored. + * + * @see rte_timer_manage() + * + * @param timer_data_id + * An identifier indicating which instance of timer data should be used for + * this operation. + * @param poll_lcores + * An array of lcore ids identifying the timer lists that should be processed. + * NULL is allowed - if NULL, the timer list corresponding to the lcore + * calling this routine is processed (same as rte_timer_manage()). + * @param n_poll_lcores + * The size of the poll_lcores array. If 'poll_lcores' is NULL, this parameter + * is ignored. + * @param f + * The callback function which should be called for all expired timers. + * @return + * - 0: success + * - -EINVAL: invalid timer_data_id + */ +int __rte_experimental +rte_timer_alt_manage(uint32_t timer_data_id, unsigned int *poll_lcores, + int n_poll_lcores, rte_timer_alt_manage_cb_t f); + +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice + * + * This function is the same as rte_timer_dump_stats(), except that it allows + * the caller to specify the rte_timer_data instance that should be used. + * + * @see rte_timer_dump_stats() + * + * @param timer_data_id + * An identifier indicating which instance of timer data should be used for + * this operation. + * @param f + * A pointer to a file for output + * @return + * - 0: success + * - -EINVAL: invalid timer_data_id + */ +int __rte_experimental +rte_timer_alt_dump_stats(uint32_t timer_data_id, FILE *f); #ifdef __cplusplus } diff --git a/lib/librte_timer/rte_timer_version.map b/lib/librte_timer/rte_timer_version.map index 9b2e4b8..b3f4b6c 100644 --- a/lib/librte_timer/rte_timer_version.map +++ b/lib/librte_timer/rte_timer_version.map @@ -13,3 +13,25 @@ DPDK_2.0 { local: *; }; + +DPDK_19.02 { + global: + + rte_timer_dump_stats; + rte_timer_manage; + rte_timer_reset; + rte_timer_stop; + rte_timer_subsystem_init; +} DPDK_2.0; + +EXPERIMENTAL { + global: + + rte_timer_alt_dump_stats; + rte_timer_alt_manage; + rte_timer_alt_reset; + rte_timer_alt_stop; + rte_timer_data_alloc; + rte_timer_data_dealloc; + rte_timer_subsystem_finalize; +}; -- 2.6.4