From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 8C6CBA046B
	for ; Thu, 22 Aug 2019 02:35:07 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id CCEE11BF25;
	Thu, 22 Aug 2019 02:35:06 +0200 (CEST)
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120])
	by dpdk.org (Postfix) with ESMTP id 285A591
	for ; Thu, 22 Aug 2019 02:35:04 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
	by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	21 Aug 2019 17:35:03 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.64,414,1559545200";
	d="scan'208";a="330199324"
Received: from jrharri1-skx.ch.intel.com (HELO [127.0.1.1]) ([143.182.137.73])
	by orsmga004.jf.intel.com with ESMTP; 21 Aug 2019 17:35:03 -0700
From: Jim Harris
To: dev@dpdk.org, anatoly.burakov@intel.com
Date: Wed, 21 Aug 2019 10:29:58 -0700
Message-ID: <156640859805.13687.5171485731407143100.stgit@jrharri1-skx>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Subject: [dpdk-dev] [PATCH v2] timer: use rte_mp_msg to get freq from primary process
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Ideally, get_tsc_freq_arch() is able to provide the TSC rate using
architecture-specific means.  When that is not possible, DPDK reverts
to calculating the TSC rate with a 100ms nanosleep or 1s sleep.  The
latter occurs more frequently in VMs which often do not have access
to the data they need from arch-specific means (CPUID leaf 0x15 or
MSR 0xCE on x86).

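For readers unfamiliar with the sleep-based fallback described above, the
arithmetic behind it is roughly: read the counter, sleep for a known wall-clock
interval, read the counter again, then scale the tick delta to one second and
round it to a nearby multiple of 10 MHz. The sketch below shows only that
arithmetic. The helper names hz_from_interval() and round_to_nearest_multiple()
are hypothetical and are not DPDK functions; the rounding is meant to resemble
what RTE_ALIGN_MUL_NEAR does with CYC_PER_10MHZ, not to reproduce it exactly.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Hypothetical helper (not a DPDK API): convert a counter delta observed
 * over `ns` nanoseconds of wall-clock time into ticks per second.
 * Assumes ticks * NS_PER_SEC fits in 64 bits, which holds for the short
 * windows (100 ms to 1 s) and GHz-range counters discussed here.
 */
static uint64_t
hz_from_interval(uint64_t ticks, uint64_t ns)
{
	assert(ns != 0);
	return ticks * NS_PER_SEC / ns;
}

/*
 * Hypothetical helper: round `value` to the nearest multiple of `mul`,
 * with ties rounding up. Used to hide small measurement jitter.
 */
static uint64_t
round_to_nearest_multiple(uint64_t value, uint64_t mul)
{
	assert(mul != 0);
	return (value + mul / 2) / mul * mul;
}
```

Under these assumptions, a delta of 209999876 ticks over a 100 ms window
scales to 2099998760 Hz and rounds to 2100000000 Hz. The cost of the
measurement is the sleep itself, which is why this path adds at least 100 ms
to startup.
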
In secondary processes, the extra 100ms is especially noticeable and
consumes the bulk of rte_eal_init() execution time.  So in secondary
processes, if we cannot get the TSC rate using get_tsc_freq_arch(),
try to get the TSC rate from the primary process instead using
rte_mp_msg.  This is much faster than 100ms.

Reduces rte_eal_init() execution time in a secondary process from 165ms
to 66ms on my test system.

Signed-off-by: Jim Harris
Change-Id: I584419ed1c7d6f47841e0a0eb23f34c9f1186d35
---
 lib/librte_eal/common/eal_common_timer.c | 62 ++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_timer.c b/lib/librte_eal/common/eal_common_timer.c
index 145543de7..ad965455d 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -15,9 +15,17 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "eal_private.h"
 
+#define EAL_TIMER_MP "eal_timer_mp_sync"
+
+struct timer_mp_param {
+	uint64_t tsc_hz;
+};
+
 /* The frequency of the RDTSC timer resolution */
 static uint64_t eal_tsc_resolution_hz;
 
@@ -74,12 +82,58 @@ estimate_tsc_freq(void)
 	return RTE_ALIGN_MUL_NEAR(rte_rdtsc() - start, CYC_PER_10MHZ);
 }
 
+static uint64_t
+get_tsc_freq_from_primary(void)
+{
+	struct rte_mp_msg mp_req = {0};
+	struct rte_mp_reply mp_reply = {0};
+	struct timer_mp_param *r;
+	struct timespec ts = {.tv_sec = 1, .tv_nsec = 0};
+	uint64_t tsc_hz;
+
+	strcpy(mp_req.name, EAL_TIMER_MP);
+	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) ||
+			mp_reply.nb_received != 1) {
+		tsc_hz = 0;
+	} else {
+		r = (struct timer_mp_param *)mp_reply.msgs[0].param;
+		tsc_hz = r->tsc_hz;
+	}
+
+	free(mp_reply.msgs);
+	return tsc_hz;
+}
+
+static int
+timer_mp_primary(__attribute__((unused)) const struct rte_mp_msg *msg,
+		const void *peer)
+{
+	struct rte_mp_msg reply = {0};
+	struct timer_mp_param *r = (struct timer_mp_param *)reply.param;
+
+	r->tsc_hz = eal_tsc_resolution_hz;
+	strcpy(reply.name, EAL_TIMER_MP);
+	reply.len_param = sizeof(*r);
+
+	return rte_mp_reply(&reply, peer);
+}
+
 void
 set_tsc_freq(void)
 {
 	uint64_t freq;
+	int rc;
 
 	freq = get_tsc_freq_arch();
+	if (!freq && rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		/* We couldn't get the TSC frequency through arch-specific
+		 * means.  If this is a secondary process, try to get the
+		 * TSC frequency from the primary process - this will
+		 * be much faster than get_tsc_freq() or estimate_tsc_freq()
+		 * below.
+		 */
+		freq = get_tsc_freq_from_primary();
+	}
 	if (!freq)
 		freq = get_tsc_freq();
 	if (!freq)
@@ -87,6 +141,14 @@ set_tsc_freq(void)
 
 	RTE_LOG(DEBUG, EAL, "TSC frequency is ~%" PRIu64 " KHz\n", freq / 1000);
 	eal_tsc_resolution_hz = freq;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		rc = rte_mp_action_register(EAL_TIMER_MP, timer_mp_primary);
+		if (rc && rte_errno != ENOTSUP) {
+			RTE_LOG(WARNING, EAL, "Could not register mp_action - "
+				"secondary processes will calculate TSC rate "
+				"independently.\n");
+		}
+	}
 }
 
 void rte_delay_us_callback_register(void (*userfunc)(unsigned int))
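
As a reading aid for the set_tsc_freq() change, the lookup order it
establishes in a secondary process (arch-specific query, then the primary
process, then the slow calibration paths) can be modeled as "try each source
in order and keep the first non-zero answer". The following is a minimal
standalone sketch of that pattern only. first_nonzero_freq() and the three
stub sources are hypothetical stand-ins with made-up return values; none of
this is DPDK code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A frequency source returns the rate in Hz, or 0 if it has no answer. */
typedef uint64_t (*freq_source_fn)(void);

/*
 * Hypothetical helper: walk the sources in priority order and return the
 * first non-zero frequency, or 0 if every source fails.
 */
static uint64_t
first_nonzero_freq(const freq_source_fn *sources, size_t n)
{
	size_t i;

	assert(sources != NULL || n == 0);
	for (i = 0; i < n; i++) {
		uint64_t hz = sources[i]();

		if (hz != 0)
			return hz;
	}
	return 0;
}

/* Stub standing in for an arch-specific query that is unavailable (e.g. in a VM). */
static uint64_t
arch_unavailable(void)
{
	return 0;
}

/* Stub standing in for a value obtained from the primary process. */
static uint64_t
from_primary(void)
{
	return 2100000000ULL;
}

/* Stub standing in for the slow sleep-based calibration. */
static uint64_t
slow_calibration(void)
{
	return 2099990000ULL;
}
```

With the chain { arch_unavailable, from_primary, slow_calibration } the helper
returns the primary's value and never calls the slow stub, which mirrors why
the patch places the request to the primary ahead of get_tsc_freq() and
estimate_tsc_freq().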