From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id 3ABECB635
 for ; Mon, 16 Feb 2015 23:44:36 +0100 (CET)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by orsmga102.jf.intel.com with ESMTP; 16 Feb 2015 14:40:34 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.09,590,1418112000"; d="scan'208";a="652977928"
Received: from irsmsx105.ger.corp.intel.com ([163.33.3.28])
 by orsmga001.jf.intel.com with ESMTP; 16 Feb 2015 14:44:33 -0800
Received: from irsmsx108.ger.corp.intel.com ([169.254.11.218])
 by irsmsx105.ger.corp.intel.com ([169.254.7.117])
 with mapi id 14.03.0195.001; Mon, 16 Feb 2015 22:44:31 +0000
From: "Dumitrescu, Cristian"
To: Stephen Hemminger, "dev@dpdk.org"
Thread-Topic: [dpdk-dev] [PATCH v2 6/7] rte_sched: eliminate floating point in calculating byte clock
Thread-Index: AQHQQQsT144+PZhSAUCRPCvI/wyA+pzz6yhw
Date: Mon, 16 Feb 2015 22:44:31 +0000
Message-ID: <3EB4FA525960D640B5BDFFD6A3D8912632318070@IRSMSX108.ger.corp.intel.com>
References: <1423116841-19799-4-git-send-email-stephen@networkplumber.org>
 <1423116841-19799-6-git-send-email-stephen@networkplumber.org>
In-Reply-To: <1423116841-19799-6-git-send-email-stephen@networkplumber.org>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [163.33.239.181]
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Cc: Stephen Hemminger
Subject: Re: [dpdk-dev] [PATCH v2 6/7] rte_sched: eliminate floating point in calculating byte clock
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 16 Feb 2015 22:44:36 -0000

Hi Stephen,

Sorry, NACK.

1. Overflow issue

As you declare cycles_per_byte as uint32_t, for a CPU frequency of 2-3 GHz, the line of code below results in overflow:

	port->cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT) / params->rate;

Therefore, there is most likely a significant accuracy loss, which might result in more packets being allowed to go out than should be.

2. Integer division has a higher cost than floating point division

My understanding is that we are considering a performance improvement by replacing the double precision floating point division in:

	double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;

with an integer division:

	uint64_t bytes_diff = (cycles_diff << RTE_SCHED_TIME_SHIFT) / port->cycles_per_byte;

I don't think this is going to have the claimed benefit, as according to the "Intel 64 and IA-32 Architectures Optimization Reference Manual" (Appendix C), the latency of the integer division instruction is significantly bigger than the latency of floating point division:

	Instruction FDIV double precision: latency = 38-40 cycles
	Instruction IDIV: latency = 56-80 cycles

3. Alternative

I do hear your suggestion, though, about replacing the floating point division with a more performant construction. One suggestion would be to replace it with an integer multiplication followed by a shift right, probably by using a uint64_t bytes_per_cycle_scaled_up (the inverse of cycles_per_byte). I need to prototype this code myself. Would you be OK to look into providing an alternative implementation?

Thanks,
Cristian

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
Sent: Thursday, February 5, 2015 6:14 AM
To: dev@dpdk.org
Cc: Stephen Hemminger
Subject: [dpdk-dev] [PATCH v2 6/7] rte_sched: eliminate floating point in calculating byte clock

From: Stephen Hemminger

The old code was doing a floating point divide for each rte_dequeue()
which is very expensive. Change to using fixed point scaled math instead.
This improved performance from 5Gbit/sec to 10 Gbit/sec

Signed-off-by: Stephen Hemminger
---
 lib/librte_sched/rte_sched.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 55fbc14..3023457 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -102,6 +102,9 @@
 
 #define RTE_SCHED_BMP_POS_INVALID UINT32_MAX
 
+/* For cycles_per_byte calculation */
+#define RTE_SCHED_TIME_SHIFT 20
+
 struct rte_sched_subport {
 	/* Token bucket (TB) */
 	uint64_t tb_time; /* time of last update */
@@ -239,7 +242,7 @@ struct rte_sched_port {
 	uint64_t time_cpu_cycles; /* Current CPU time measured in CPU cyles */
 	uint64_t time_cpu_bytes; /* Current CPU time measured in bytes */
 	uint64_t time; /* Current NIC TX time measured in bytes */
-	double cycles_per_byte; /* CPU cycles per byte */
+	uint32_t cycles_per_byte; /* CPU cycles per byte (scaled) */
 
 	/* Scheduling loop detection */
 	uint32_t pipe_loop;
@@ -657,7 +660,9 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 	port->time_cpu_cycles = rte_get_tsc_cycles();
 	port->time_cpu_bytes = 0;
 	port->time = 0;
-	port->cycles_per_byte = ((double) rte_get_tsc_hz()) / ((double) params->rate);
+
+	port->cycles_per_byte = (rte_get_tsc_hz() << RTE_SCHED_TIME_SHIFT)
+		/ params->rate;
 
 	/* Scheduling loop detection */
 	port->pipe_loop = RTE_SCHED_PIPE_INVALID;
@@ -2156,11 +2161,12 @@ rte_sched_port_time_resync(struct rte_sched_port *port)
 {
 	uint64_t cycles = rte_get_tsc_cycles();
 	uint64_t cycles_diff = cycles - port->time_cpu_cycles;
-	double bytes_diff = ((double) cycles_diff) / port->cycles_per_byte;
+	uint64_t bytes_diff = (cycles_diff << RTE_SCHED_TIME_SHIFT)
+		/ port->cycles_per_byte;
 
 	/* Advance port time */
 	port->time_cpu_cycles = cycles;
-	port->time_cpu_bytes += (uint64_t) bytes_diff;
+	port->time_cpu_bytes += bytes_diff;
 	if (port->time < port->time_cpu_bytes) {
 		port->time = port->time_cpu_bytes;
 	}
-- 
2.1.4

--------------------------------------------------------------
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
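
For illustration, the multiply-and-shift alternative suggested in point 3 of the review could look roughly like the sketch below. It is not code from the patch or from DPDK: the macro BYTES_PER_CYCLE_SHIFT and the function names are hypothetical, chosen only to mirror the bytes_per_cycle_scaled_up name mentioned in the review, and the 32-bit shift is an assumed choice. The idea is that the single division moves to configuration time, while the per-resync path uses one multiplication and one shift right.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scaling factor (not from the patch): 32 fractional bits. */
#define BYTES_PER_CYCLE_SHIFT 32

/*
 * Configuration time: the only division happens here, off the fast path.
 * Returns bytes per CPU cycle as a fixed-point value with
 * BYTES_PER_CYCLE_SHIFT fractional bits. Because rate is 32 bits wide,
 * the left shift by 32 always fits in a uint64_t.
 */
static uint64_t
bytes_per_cycle_scaled_up(uint64_t tsc_hz, uint32_t rate)
{
	assert(tsc_hz != 0);
	return ((uint64_t)rate << BYTES_PER_CYCLE_SHIFT) / tsc_hz;
}

/*
 * Fast path: one multiply and one shift right, no division.
 * Assumption: cycles_diff * scaled fits in 64 bits. With rate <= tsc_hz
 * (so scaled <= 2^32) this holds for any cycles_diff below 2^32.
 * The shift truncates, so fractional bytes are dropped on each call.
 */
static uint64_t
cycles_to_bytes(uint64_t cycles_diff, uint64_t scaled)
{
	return (cycles_diff * scaled) >> BYTES_PER_CYCLE_SHIFT;
}
```

With assumed values of a 2.5 GHz TSC and a port rate of 1.25e9 bytes/sec, the scaled factor is exactly 2^31 and 1000 elapsed cycles convert to 500 bytes. Since truncation discards the fractional byte on every call, a real implementation would probably want to carry the remainder forward or advance the cycle timestamp only by the cycles actually accounted for.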