From: Stephen Hemminger
To: cristian.dumitrescu@intel.com
Date: Tue, 10 Mar 2015 09:13:38 -0700
Message-Id: <1426004018-25948-7-git-send-email-stephen@networkplumber.org>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1426004018-25948-1-git-send-email-stephen@networkplumber.org>
References: <1426004018-25948-1-git-send-email-stephen@networkplumber.org>
Cc: dev@dpdk.org, Stephen Hemminger
Subject: [dpdk-dev] [PATCH v2 6/6] rte_sched: eliminate floating point in calculating byte clock

From: Stephen Hemminger

The old code was doing a floating point divide for each rte_dequeue()
which is very expensive. Change to using fixed point scaled math instead.
This improved performance from 5 Gbit/sec to 10 Gbit/sec.

Signed-off-by: Stephen Hemminger
---
v2 -- no changes despite objections, the performance observation is
      real on Intel(R) Core(TM) i7-3770 CPU

 lib/librte_sched/rte_sched.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 74d0e0a..522a647 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -102,6 +102,9 @@
 
 #define RTE_SCHED_BMP_POS_INVALID             UINT32_MAX
 
+/* For cycles_per_byte calculation */
+#define RTE_SCHED_TIME_SHIFT		      20
+
 struct rte_sched_subport {
 	/* Token bucket (TB) */
 	uint64_t tb_time; /* time of last update */
@@ -239,7 +242,7 @@ struct rte_sched_port {
 	uint64_t time_cpu_cycles;     /* Current CPU time measured in CPU cyles */
 	uint64_t time_cpu_bytes;      /* Current CPU time measured in bytes */
 	uint64_t time;                /* Current NIC TX time measured in bytes */
-	double cycles_per_byte;       /* CPU cycles per byte */
+	uint32_t cycles_per_byte;     /* CPU cycles per byte (scaled) */
 
 	/* Scheduling loop detection */
 	uint32_t pipe_loop;
@@ -657,7 +660,9 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 	port->time_cpu_cycles = rte_get_tsc_cycles();
 	port->time_cpu_bytes = 0;
 	port->time = 0;
-	port->cycles_per_byte = ((double) rte_get_tsc_hz()) / ((double) params->rate);
+
+	port->cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT)
+		/ params->rate;
 
 	/* Scheduling loop detection */
 	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
@@ -2126,11 +2131,12 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 {
 	uint64_t cycles = rte_get_tsc_cycles();
 	uint64_t cycles_diff = cycles - port->time_cpu_cycles;
-	double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
+	uint64_t bytes_diff = (cycles_diff << RTE_SCHED_TIME_SHIFT)
+		/ port->cycles_per_byte;
 
 	/* Advance port time */
 	port->time_cpu_cycles = cycles;
-	port->time_cpu_bytes += (uint64_t) bytes_diff;
+	port->time_cpu_bytes += bytes_diff;
 	if (port->time < port->time_cpu_bytes) {
 		port->time = port->time_cpu_bytes;
 	}
-- 
2.1.4
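
The scaled arithmetic in the patch can be tried outside DPDK. The
following is a minimal standalone sketch, not part of the patch. The
helper names scaled_cycles_per_byte() and cycles_to_bytes() and the
constant TIME_SHIFT are hypothetical stand-ins for the expressions in
rte_sched_port_config() and rte_sched_port_time_resync(), and the clock
frequency is passed in as a plain argument instead of being read from
rte_get_tsc_hz(). The range checks are an addition of this sketch and
are not present in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same shift value as RTE_SCHED_TIME_SHIFT in the patch. */
#define TIME_SHIFT 20

/*
 * Hypothetical helper: CPU cycles per byte, scaled by 2^TIME_SHIFT.
 * tsc_hz is the timestamp counter frequency in Hz, rate is in bytes
 * per second.
 */
static uint32_t
scaled_cycles_per_byte(uint64_t tsc_hz, uint64_t rate)
{
	uint64_t scaled;

	assert(rate != 0);
	scaled = (tsc_hz << TIME_SHIFT) / rate;

	/* The patch stores this value in a uint32_t, so it must fit. */
	assert(scaled <= UINT32_MAX);
	return (uint32_t)scaled;
}

/*
 * Hypothetical helper: convert elapsed CPU cycles to bytes with one
 * integer shift and one integer divide, no floating point.
 */
static uint64_t
cycles_to_bytes(uint64_t cycles_diff, uint32_t cycles_per_byte)
{
	assert(cycles_per_byte != 0);
	return (cycles_diff << TIME_SHIFT) / cycles_per_byte;
}
```

With a 2 GHz clock and a rate of 1e9 bytes/sec, the scaled value is
2 << 20 = 2097152, and 2000 elapsed cycles convert back to 1000 bytes.
The 32-bit range check reflects that the scaled value only fits in a
uint32_t when rate is at least roughly tsc_hz / 4096.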