From: Stephen Hemminger
To: cristian.dumitrescu@intel.com
Cc: dev@dpdk.org
Date: Sun, 29 Nov 2015 10:46:49 -0800
Message-Id: <1448822809-8350-4-git-send-email-stephen@networkplumber.org>
In-Reply-To: <1448822809-8350-1-git-send-email-stephen@networkplumber.org>
References: <1448822809-8350-1-git-send-email-stephen@networkplumber.org>
X-Mailer: git-send-email 2.1.4
List-Id: patches and discussions about DPDK
Subject: [dpdk-dev] [PATCH 3/3] rte_sched: eliminate floating point in calculating byte clock

The old code did a floating point divide for each rte_dequeue(),
which is very expensive. Change to a fixed point scaled inverse
multiply. To maintain equivalent precision, scaled math is used.
The application ABI is the same.

This improved performance from 5 Gbit/sec to 10 Gbit/sec when
configured for a 10 Gbit/sec rate.

There was some feedback from Cristian that he wanted a better
solution and was going to give one, but none was provided.
For 2.2 this is a better solution than the existing code. If someone
has a better version I would love to see it.

Signed-off-by: Stephen Hemminger
---
 lib/librte_sched/rte_sched.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 16acd6b..cfae136 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -47,6 +47,7 @@
 #include "rte_bitmap.h"
 #include "rte_sched_common.h"
 #include "rte_approx.h"
+#include "rte_reciprocal.h"
 
 #ifdef __INTEL_COMPILER
 #pragma warning(disable:2259) /* conversion may lose significant bits */
@@ -62,6 +63,11 @@
 #define RTE_SCHED_PIPE_INVALID        UINT32_MAX
 #define RTE_SCHED_BMP_POS_INVALID     UINT32_MAX
 
+/* Scaling for cycles_per_byte calculation
+ * Chosen so that minimum rate is 480 bit/sec
+ */
+#define RTE_SCHED_TIME_SHIFT	      8
+
 struct rte_sched_subport {
 	/* Token bucket (TB) */
 	uint64_t tb_time; /* time of last update */
@@ -215,7 +221,7 @@ struct rte_sched_port {
 	uint64_t time_cpu_cycles;     /* Current CPU time measured in CPU cyles */
 	uint64_t time_cpu_bytes;      /* Current CPU time measured in bytes */
 	uint64_t time;                /* Current NIC TX time measured in bytes */
-	double cycles_per_byte;       /* CPU cycles per byte */
+	struct rte_reciprocal inv_cycles_per_byte; /* CPU cycles per byte */
 
 	/* Scheduling loop detection */
 	uint32_t pipe_loop;
@@ -610,7 +616,7 @@ struct rte_sched_port *
 rte_sched_port_config(struct rte_sched_port_params *params)
 {
 	struct rte_sched_port *port = NULL;
-	uint32_t mem_size, bmp_mem_size, n_queues_per_port, i;
+	uint32_t mem_size, bmp_mem_size, n_queues_per_port, i, cycles_per_byte;
 
 	/* Check user parameters. Determine the amount of memory to allocate */
 	mem_size = rte_sched_port_get_memory_footprint(params);
@@ -661,7 +667,10 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 	port->time_cpu_cycles = rte_get_tsc_cycles();
 	port->time_cpu_bytes = 0;
 	port->time = 0;
-	port->cycles_per_byte = ((double) rte_get_tsc_hz()) / ((double) params->rate);
+
+	cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT)
+		/ params->rate;
+	port->inv_cycles_per_byte = rte_reciprocal_value(cycles_per_byte);
 
 	/* Scheduling loop detection */
 	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
@@ -2088,11 +2097,15 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 {
 	uint64_t cycles = rte_get_tsc_cycles();
 	uint64_t cycles_diff = cycles - port->time_cpu_cycles;
-	double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
+	uint64_t bytes_diff;
+
+	/* Compute elapsed time in bytes */
+	bytes_diff = rte_reciprocal_divide(cycles_diff << RTE_SCHED_TIME_SHIFT,
+					   port->inv_cycles_per_byte);
 
 	/* Advance port time */
 	port->time_cpu_cycles = cycles;
-	port->time_cpu_bytes += (uint64_t) bytes_diff;
+	port->time_cpu_bytes += bytes_diff;
 	if (port->time < port->time_cpu_bytes)
 		port->time = port->time_cpu_bytes;
 
-- 
2.1.4
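
For readers who want to experiment with the idea outside of DPDK, the
scaled fixed point arithmetic in the patch above can be sketched roughly
as follows. This is only an illustration, not the rte_reciprocal
implementation: recip_value(), recip_divide(), scaled_cycles_per_byte()
and cycles_to_bytes() are hypothetical names, TIME_SHIFT merely mirrors
RTE_SCHED_TIME_SHIFT, and the reciprocal is a simple ceil(2^64 / d)
multiplier that relies on the GCC/Clang unsigned __int128 extension.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors RTE_SCHED_TIME_SHIFT from the patch (8 fractional bits). */
#define TIME_SHIFT 8

/* Hypothetical helper: precompute ceil(2^64 / d) for 2 <= d < 2^32. */
static uint64_t recip_value(uint32_t d)
{
	assert(d >= 2);	/* d == 1 would wrap the multiplier to 0 */
	return UINT64_MAX / d + 1;
}

/* Hypothetical helper: floor(n / d) via one widening multiply.
 * With a 64-bit multiplier this is exact for every 32-bit n.
 */
static uint32_t recip_divide(uint32_t n, uint64_t inv)
{
	return (uint32_t)(((unsigned __int128)n * inv) >> 64);
}

/* Setup time: cycles per byte with TIME_SHIFT fractional bits.
 * The caller must ensure the result fits in 32 bits.
 */
static uint32_t scaled_cycles_per_byte(uint64_t tsc_hz, uint32_t rate)
{
	return (uint32_t)((tsc_hz << TIME_SHIFT) / rate);
}

/* Fast path: elapsed cycles to elapsed bytes without a divide.
 * In this sketch cycles_diff must stay below 2^(32 - TIME_SHIFT).
 */
static uint32_t cycles_to_bytes(uint32_t cycles_diff, uint64_t inv)
{
	return recip_divide(cycles_diff << TIME_SHIFT, inv);
}
```

As a worked case, assume a 3 GHz TSC and a rate of 1250000000 bytes/sec
(10 Gbit/sec): the scaled divisor is 614, and 3000 elapsed cycles map to
1250 bytes, which is what the floating point expression 3000 / 2.4 gives
as well. The 32-bit limits noted in the comments are properties of this
sketch; whether and how the real code bounds cycles_diff is not shown in
the patch.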