From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) by dpdk.org (Postfix) with ESMTP id CF1EC5913 for ; Wed, 2 Dec 2015 17:57:05 +0100 (CET) Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.nyi.internal (Postfix) with ESMTP id 02439209A5 for ; Wed, 2 Dec 2015 11:57:03 -0500 (EST) Received: from web1 ([10.202.2.211]) by compute2.internal (MEProxy); Wed, 02 Dec 2015 11:57:04 -0500 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= stressinduktion.org; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=mesmtp; bh=5EoOWNoFh0spROrc pZcyoG4qH3s=; b=oRd7bhNuWPiqzZCxtbA3Ydjt+Zw2Jw1rzZ9dHuabQaYKm6hY i47aDuPMfRGhjr2QmKZoiAWXlWsabAh9LuBdixoRZ70SMXfcWQ9nMbMLwJZiGlwk sUHycpTDiimZc4XniS1A+MFUG6Ux143AXRXODi7a6pndMbWbrVDYTdZmCks= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-sasl-enc:x-sasl-enc; s=smtpout; bh=5EoOWNoFh0spROr cpZcyoG4qH3s=; b=jToO1rC1BoD/afDqm6/adzIHSu23QgB5ZzdWT3MwYFjqc9w RxmhxTu7e7UvZvsSHmS3X3MpEkjYS/dKwIsb7rztB4h33p2AwkfTGfrjIt8Gz3/1 YOtH36Pxrtpxx+TBfAIqZ66C2Ef1D4ZrVWldLXruA4C117MIgCnX82EME10M= Received: by web1.nyi.internal (Postfix, from userid 99) id B9A69AF0101; Wed, 2 Dec 2015 11:57:03 -0500 (EST) Message-Id: <1449075423.3815403.455981761.77105895@webmail.messagingengine.com> X-Sasl-Enc: Fzt7LcE4Vpy9MFCKLmlK8GgV+adeikbXUVa5V9IcWe4s 1449075423 From: Hannes Frederic Sowa To: "Dumitrescu, Cristian" , Stephen Hemminger MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="UTF-8" X-Mailer: MessagingEngine.com Webmail Interface - ajax-b94e6169 In-Reply-To: <3EB4FA525960D640B5BDFFD6A3D8912647925BB6@IRSMSX108.ger.corp.intel.com> References: <1448822809-8350-1-git-send-email-stephen@networkplumber.org> <1448822809-8350-3-git-send-email-stephen@networkplumber.org> <3EB4FA525960D640B5BDFFD6A3D8912647925BB6@IRSMSX108.ger.corp.intel.com> Date: Wed, 02 Dec 2015 17:57:03 +0100 Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH 2/3] rte_sched: introduce reciprocal divide X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Dec 2015 16:57:06 -0000 Hello, On Wed, Dec 2, 2015, at 17:45, Dumitrescu, Cristian wrote: > > diff --git a/lib/librte_sched/rte_reciprocal.h > > b/lib/librte_sched/rte_reciprocal.h > > new file mode 100644 > > index 0000000..abd1525 > > --- /dev/null > > +++ b/lib/librte_sched/rte_reciprocal.h > > @@ -0,0 +1,39 @@ > > +/* > > + * Reciprocal divide > > + * > > + * Used with permission from original authors > > + * Hannes Frederic Sowa and Daniel Borkmann > > + * > > + * This algorithm is based on the paper "Division by Invariant > > + * Integers Using Multiplication" by Torbj=C3=83=C2=B6rn Granlund and = Peter > > + * L. Montgomery. >=20 > Stephen, can you please provide a link to this paper? > > + * > > + * The assembler implementation from Agner Fog, which this code is > > + * based on, can be found here: > > + * http://www.agner.org/optimize/asmlib.zip > > + * > > + * This optimization for A/B is helpful if the divisor B is mostly > > + * runtime invariant. The reciprocal of B is calculated in the > > + * slow-path with reciprocal_value(). The fast-path can then just use > > + * a much faster multiplication operation with a variable dividend A > > + * to calculate the division A/B. > > + */ > > + > > +#ifndef _RTE_RECIPROCAL_H_ > > +#define _RTE_RECIPROCAL_H_ > > + > > +struct rte_reciprocal { > > + uint32_t m; > > + uint8_t sh1, sh2; > > +}; >=20 > The size of this structure is not a multiple of 32 bits. You seem to > transfer this structure by value rather than by reference (the function > rte_reciprocal_value() below returns an instance of this structure), I > don't feel comfortable with the last 16 bits of the structure being left > uninitialized, we should probably add some explicit pad field and > initialize this structure explicitly to zero at init time? Note, it is used by static inline functions in fast path which most probably expands the code in question, thus no real argument passing happens (at least this is what I saw in the linux kernel assembly). I don't think you need to worry about padding. It happens very often without noticing. ;) > > + > > +static inline uint32_t rte_reciprocal_divide(uint32_t a, struct rte_re= ciprocal > > R) > > +{ > > + uint32_t t =3D (uint32_t)(((uint64_t)a * R.m) >> 32); > > + > > + return (t + ((a - t) >> R.sh1)) >> R.sh2; > > +} > > + > > +struct rte_reciprocal rte_reciprocal_value(uint32_t d); >=20 > Why 32-bit arithmetic? We had a lot of bugs in librte_sched library due > to 32-bit arithmetic that were particularly difficult to track. Can we > have this function rte_reciprocal_divide() return a 64-bit integer and > replace any 32-bit arithmetic/conversion with 64-bit operations? There was no use case at this time and I am actually not sure how easy the move to 64 bit is, as it would require one multiplication operation in an integer domain twice as large. > > + > > +#endif /* _RTE_RECIPROCAL_H_ */ > > -- > > 2.1.4 >=20 > As previously discussed, a simpler/faster alternative to floating point > division is 64-bit multiplication followed by right shift, any particular > reason why this approach was not considered? This is exact division. It depends on what you want. I am not sure if you want to do array accesses with floating point division or simple 64 bit multiplication and shifting. Bye, Hannes