From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
	by dpdk.org (Postfix) with ESMTP id 6BF02C39A
	for ; Mon, 20 Jul 2015 05:02:53 +0200 (CEST)
Received: from orsmga003.jf.intel.com ([10.7.209.27])
	by orsmga101.jf.intel.com with ESMTP; 19 Jul 2015 20:02:53 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.15,506,1432623600";
	d="scan'208";a="609151003"
Received: from shvmail01.sh.intel.com ([10.239.29.42])
	by orsmga003.jf.intel.com with ESMTP; 19 Jul 2015 20:02:51 -0700
Received: from shecgisg004.sh.intel.com (shecgisg004.sh.intel.com [10.239.29.89])
	by shvmail01.sh.intel.com with ESMTP id t6K32nl5030007;
	Mon, 20 Jul 2015 11:02:49 +0800
Received: from shecgisg004.sh.intel.com (localhost [127.0.0.1])
	by shecgisg004.sh.intel.com (8.13.6/8.13.6/SuSE Linux 0.8) with ESMTP id t6K32kEc003632;
	Mon, 20 Jul 2015 11:02:48 +0800
Received: (from cliang18@localhost)
	by shecgisg004.sh.intel.com (8.13.6/8.13.6/Submit) id t6K32kdJ003628;
	Mon, 20 Jul 2015 11:02:46 +0800
From: Cunming Liang
To: dev@dpdk.org, thomas.monjalon@6wind.com
Date: Mon, 20 Jul 2015 11:02:18 +0800
Message-Id: <1437361349-2801-3-git-send-email-cunming.liang@intel.com>
X-Mailer: git-send-email 1.7.4.1
In-Reply-To: <1437361349-2801-1-git-send-email-cunming.liang@intel.com>
References: <1437113775-32199-1-git-send-email-cunming.liang@intel.com>
	<1437361349-2801-1-git-send-email-cunming.liang@intel.com>
Cc: shemming@brocade.com
Subject: [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 20 Jul 2015 03:02:54 -0000

The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When event fds are added to a specified epoll instance, 'eptrs' will hold the rte_epoll_event object pointer.
The 'op' uses the same enum values as epoll_ctl does.
The epoll event can carry raw user data and register a callback which is executed during wakeup.

Signed-off-by: Cunming Liang
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v11 changes
 - cleanup spelling error

v9 changes
 - rework on coding style

v8 changes
 - support safe deletion of an event during the wakeup execution
 - handle EINTR during epoll_wait

v7 changes
 - split v6[4/8] into two patches, one for epoll event (this one),
   another for rx intr (next patch)
 - introduce rte_epoll_event definition
 - rte_epoll_wait/ctl for more generic RTE epoll API

v6 changes
 - split rte_intr_wait_rx_pkt into two functions, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 139 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  80 ++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   3 +
 3 files changed, 222 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 61e7c85..55be263 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -69,6 +69,8 @@
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
 
+static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+
 /**
  * union for pipe fds.
  */
@@ -896,3 +898,140 @@ rte_eal_intr_init(void)
 
 	return -ret;
 }
+
+static int
+eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
+			struct rte_epoll_event *events)
+{
+	unsigned int i, count = 0;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < n; i++) {
+		rev = evs[i].data.ptr;
+		if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
+						 RTE_EPOLL_EXEC))
+			continue;
+
+		events[count].status = RTE_EPOLL_VALID;
+		events[count].fd = rev->fd;
+		events[count].epfd = rev->epfd;
+		events[count].epdata.event = rev->epdata.event;
+		events[count].epdata.data = rev->epdata.data;
+		if (rev->epdata.cb_fun)
+			rev->epdata.cb_fun(rev->fd,
+					   rev->epdata.cb_arg);
+
+		rte_compiler_barrier();
+		rev->status = RTE_EPOLL_VALID;
+		count++;
+	}
+	return count;
+}
+
+static inline int
+eal_init_tls_epfd(void)
+{
+	int pfd = epoll_create(255);
+
+	if (pfd < 0) {
+		RTE_LOG(ERR, EAL,
+			"Cannot create epoll instance\n");
+		return -1;
+	}
+	return pfd;
+}
+
+int
+rte_intr_tls_epfd(void)
+{
+	if (RTE_PER_LCORE(_epfd) == -1)
+		RTE_PER_LCORE(_epfd) = eal_init_tls_epfd();
+
+	return RTE_PER_LCORE(_epfd);
+}
+
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout)
+{
+	struct epoll_event evs[maxevents];
+	int rc;
+
+	if (!events) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	while (1) {
+		rc = epoll_wait(epfd, evs, maxevents, timeout);
+		if (likely(rc > 0)) {
+			/* epoll_wait has at least one fd ready to read */
+			rc = eal_epoll_process_event(evs, rc, events);
+			break;
+		} else if (rc < 0) {
+			if (errno == EINTR)
+				continue;
+			/* epoll_wait fail */
+			RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
+				strerror(errno));
+			rc = -1;
+			break;
+		}
+	}
+
+	return rc;
+}
+
+static inline void
+eal_epoll_data_safe_free(struct rte_epoll_event *ev)
+{
+	while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
+				    RTE_EPOLL_INVALID))
+		while (ev->status != RTE_EPOLL_VALID)
+			rte_pause();
+	memset(&ev->epdata, 0, sizeof(ev->epdata));
+	ev->fd = -1;
+	ev->epfd = -1;
+}
+
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event)
+{
+	struct epoll_event ev;
+
+	if (!event) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	if (op == EPOLL_CTL_ADD) {
+		event->status = RTE_EPOLL_VALID;
+		event->fd = fd;  /* ignore fd in event */
+		event->epfd = epfd;
+		ev.data.ptr = (void *)event;
+	}
+
+	ev.events = event->epdata.event;
+	if (epoll_ctl(epfd, op, fd, &ev) < 0) {
+		RTE_LOG(ERR, EAL, "Error op %d fd %d epoll_ctl, %s\n",
+			op, fd, strerror(errno));
+		if (op == EPOLL_CTL_ADD)
+			/* rollback status when CTL_ADD fail */
+			event->status = RTE_EPOLL_INVALID;
+		return -1;
+	}
+
+	if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+		eal_epoll_data_safe_free(event);
+
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index ac33eda..886608c 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -51,6 +51,32 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_MAX
 };
 
+#define RTE_INTR_EVENT_ADD            1UL
+#define RTE_INTR_EVENT_DEL            2UL
+
+typedef void (*rte_intr_event_cb_t)(int fd, void *arg);
+
+struct rte_epoll_data {
+	uint32_t event;               /**< event type */
+	void *data;                   /**< User data */
+	rte_intr_event_cb_t cb_fun;   /**< IN: callback fun */
+	void *cb_arg;                 /**< IN: callback arg */
+};
+
+enum {
+	RTE_EPOLL_INVALID = 0,
+	RTE_EPOLL_VALID,
+	RTE_EPOLL_EXEC,
+};
+
+/** interrupt epoll event obj, taken by epoll_event.ptr */
+struct rte_epoll_event {
+	volatile uint32_t status;  /**< OUT: event status */
+	int fd;                    /**< OUT: event fd */
+	int epfd;       /**< OUT: epoll instance the ev associated with */
+	struct rte_epoll_data epdata;
+};
+
 /** Handle for interrupts. */
 struct rte_intr_handle {
 	union {
@@ -64,8 +90,62 @@ struct rte_intr_handle {
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
 	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+				       /**< intr vector epoll event */
 	int *intr_vec;                 /**< intr vector number array */
 #endif
 };
 
+#define RTE_EPOLL_PER_THREAD        -1  /**< to hint using per thread epfd */
+
+/**
+ * It waits for events on the epoll instance.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller wait for events.
+ * @param events
+ *   Memory area contains the events that will be available for the caller.
+ * @param maxevents
+ *   Up to maxevents are returned, must greater than zero.
+ * @param timeout
+ *   Specifying a timeout of -1 causes a block indefinitely.
+ *   Specifying a timeout equal to zero cause to return immediately.
+ * @return
+ *   - On success, returns the number of available event.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout);
+
+/**
+ * It performs control operations on epoll instance referred by the epfd.
+ * It requests that the operation op be performed for the target fd.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller perform control operations.
+ * @param op
+ *   The operation be performed for the target fd.
+ * @param fd
+ *   The target fd on which the control ops perform.
+ * @param event
+ *   Describes the object linked to the fd.
+ *   Note: The caller must take care the object deletion after CTL_DEL.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event);
+
+/**
+ * The function returns the per thread epoll instance.
+ *
+ * @return
+ *   epfd the epoll instance referred to.
+ */
+int
+rte_intr_tls_epfd(void);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index b2d4441..39cc2d2 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -116,6 +116,9 @@ DPDK_2.1 {
 	global:
 
 	rte_eal_pci_detach;
+	rte_epoll_ctl;
+	rte_epoll_wait;
+	rte_intr_tls_epfd;
 	rte_memzone_free;
 
 } DPDK_2.0;
-- 
1.8.1.4
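
For readers who want to see the pattern in isolation: the sketch below is
not part of the patch and does not use the DPDK API. It is a rough,
standalone illustration, using only plain Linux epoll(7) and eventfd(2), of
the idea the patch builds on: a per-fd descriptor is stored in
epoll_event.data.ptr at EPOLL_CTL_ADD time, and a callback from that
descriptor is run at wakeup, with EINTR retried. All demo_* names are
hypothetical and exist only for this example; the status/cmpset handshake
that the patch uses for safe deletion is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for rte_intr_event_cb_t. */
typedef void (*demo_cb_t)(int fd, void *arg);

/* Hypothetical, trimmed-down analogue of struct rte_epoll_event. */
struct demo_event {
	int fd;           /* fd being watched */
	uint32_t events;  /* epoll event mask, e.g. EPOLLIN */
	demo_cb_t cb;     /* callback run at wakeup, may be NULL */
	void *cb_arg;     /* opaque argument passed to cb */
};

/* Callback: read the eventfd counter and add it to the total in *arg. */
static void
demo_drain(int fd, void *arg)
{
	uint64_t val = 0;

	if (read(fd, &val, sizeof(val)) == (ssize_t)sizeof(val))
		*(uint64_t *)arg += val;
}

/* Register ev->fd on epfd, keeping the descriptor in data.ptr. */
static int
demo_add(int epfd, struct demo_event *ev)
{
	struct epoll_event kev = { .events = ev->events };

	kev.data.ptr = ev;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, ev->fd, &kev);
}

/* Wait once (retrying on EINTR), run callbacks, return the event count. */
static int
demo_wait(int epfd, int timeout_ms)
{
	struct epoll_event kevs[8];
	int i, n;

	do {
		n = epoll_wait(epfd, kevs, 8, timeout_ms);
	} while (n < 0 && errno == EINTR);

	for (i = 0; i < n; i++) {
		struct demo_event *ev = kevs[i].data.ptr;

		if (ev != NULL && ev->cb != NULL)
			ev->cb(ev->fd, ev->cb_arg);
	}
	return n;
}

/* End to end: returns the total collected by the callback, or -1. */
long
demo_run(void)
{
	uint64_t total = 0, kick = 3;
	struct demo_event ev;
	long ret = -1;
	int epfd = epoll_create1(0);
	int efd = eventfd(0, EFD_NONBLOCK);

	if (epfd >= 0 && efd >= 0) {
		ev.fd = efd;
		ev.events = EPOLLIN;
		ev.cb = demo_drain;
		ev.cb_arg = &total;
		/* Make the eventfd readable, register it, then wait. */
		if (write(efd, &kick, sizeof(kick)) == (ssize_t)sizeof(kick) &&
		    demo_add(epfd, &ev) == 0 &&
		    demo_wait(epfd, 1000) == 1)
			ret = (long)total;
	}
	if (efd >= 0)
		close(efd);
	if (epfd >= 0)
		close(epfd);
	return ret;
}
```

On a Linux host, calling demo_run() from a small main() should return 3,
since the callback reads back the value written to the eventfd. With the
API in this patch the analogous steps would presumably be rte_epoll_ctl()
with EPOLL_CTL_ADD followed by rte_epoll_wait(), passing
RTE_EPOLL_PER_THREAD as epfd to use the per-thread instance.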