From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from prod-mail-xrelay02.akamai.com (prod-mail-xrelay02.akamai.com [72.246.2.14]) by dpdk.org (Postfix) with ESMTP id 1325B5A73 for ; Fri, 27 Mar 2015 17:38:17 +0100 (CET) Received: from prod-mail-xrelay02.akamai.com (localhost [127.0.0.1]) by postfix.imss70 (Postfix) with ESMTP id 6A5CC285F6; Fri, 27 Mar 2015 16:38:16 +0000 (GMT) Received: from prod-mail-relay07.akamai.com (prod-mail-relay07.akamai.com [172.17.121.112]) by prod-mail-xrelay02.akamai.com (Postfix) with ESMTP id 563C5285F5; Fri, 27 Mar 2015 16:38:16 +0000 (GMT) Received: from email.msg.corp.akamai.com (usma1ex-casadmn.msg.corp.akamai.com [172.27.123.33]) by prod-mail-relay07.akamai.com (Postfix) with ESMTP id 508D6800E3; Fri, 27 Mar 2015 16:38:16 +0000 (GMT) Received: from USMA1EX-DAG1MB3.msg.corp.akamai.com (172.27.123.103) by usma1ex-dag1mb1.msg.corp.akamai.com (172.27.123.101) with Microsoft SMTP Server (TLS) id 15.0.913.22; Fri, 27 Mar 2015 12:38:02 -0400 Received: from USMA1EX-DAG1MB3.msg.corp.akamai.com ([172.27.123.103]) by usma1ex-dag1mb3.msg.corp.akamai.com ([172.27.123.103]) with mapi id 15.00.0913.011; Fri, 27 Mar 2015 12:38:02 -0400 From: "Sanford, Robert" To: "dev@dpdk.org" , "David.Marchand@6wind.com" , "Bruce.Richardson@intel.com" Thread-Topic: [dpdk-dev] [RFC PATCH] eal: rte_rand yields only 62 random bits Thread-Index: AQHQYM35q+guik72c0G3xEFSgoIeFZ0wly2A Date: Fri, 27 Mar 2015 16:38:01 +0000 Message-ID: References: <1426609081-47774-1-git-send-email-rsanford@akamai.com> In-Reply-To: <1426609081-47774-1-git-send-email-rsanford@akamai.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: user-agent: Microsoft-MacOutlook/14.4.3.140616 x-originating-ip: [172.19.132.52] Content-Type: text/plain; charset="us-ascii" Content-ID: <1190DC090F6C0348AB552CA930A767E1@akamai.com> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [RFC PATCH] eal: rte_rand yields only 62 random bits X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Mar 2015 16:38:17 -0000 Never mind ... I cancel the previous suggestion. After further reading on RNGs, I believe we should abandon the use of lrand48() and implement our own RNG based on the so-called KISS family of RNGs originally proposed by the late George Marsaglia. In his excellent paper, "Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications", David Jones (UCL Bioinformatics Group) describes a few variants of KISS generators. This paper, and Robert G. Brown's (Duke Univ.) comprehensive "Dieharder" random number test suite work show that KISS RNGs are simple and fast, yet high quality. Something like JLKISS64(), with state kept in TLS, would be ideal for DPDK use. In limited experiments, I found JLKISS64() (not inlined, compiles to ~40 instructions) to be ~4 times faster than rte_rand(). This is probably because JLKISS64() achieves integer-instruction parallelism, while rte_rand(), with its two calls to lrand48(), nrand48_r(), and __drand48_iterate(), does not (and all those calls!). Here is the JLKISS64() function, as it appears in Jones's GoodPracticeRNG paper: --- =09 =09 =09 =09 =09 =09 =09 =09 /* Public domain code for JLKISS64 RNG - long period KISS RNG producing 64-bit results */ unsigned long long x =3D 123456789123ULL,y =3D 987654321987ULL; /* See= d variables */ unsigned int z1 =3D 43219876, c1 =3D 6543217, z2 =3D 21987643, c2 =3D 17326= 54; /* Seed variables */ unsigned long long JLKISS64() { unsigned long long t; x =3D 1490024343005336237ULL * x + 123456789; y ^=3D y << 21; y ^=3D y >> 17; y ^=3D y << 30; /* Do not set y=3D0! */ t =3D 4294584393ULL * z1 + c1; c1 =3D t >> 32; z1 =3D t; t =3D 4246477509ULL * z2 + c2; c2 =3D t >> 32; z2 =3D t; return x + y + z1 + ((unsigned long long)z2 << 32); /* Return 64-bit result */ } =09 =09 =09 =09 -- Regards, Robert >The implementation of rte_rand() returns only 62 bits of >pseudo-randomness, because the underlying calls to lrand48() >"return non-negative long integers uniformly distributed >between 0 and 2^31." > >We have written a potential fix, but before we spend more >time testing and refining it, I wanted to check with you >guys. > >We switched to using the reentrant versions of [ls]rand48, >and maintain per-lcore state. We need ~2.06 calls to >lrand48_r(), per call to rte_rand(). > >Do you agree with the approach we've taken in this patch? > > >Thanks, >Robert > >--- > lib/librte_eal/common/include/rte_random.h | 33 >+++++++++++++++++++++++++-- > lib/librte_eal/linuxapp/eal/eal_thread.c | 7 ++++++ > 2 files changed, 37 insertions(+), 3 deletions(-) > >diff --git a/lib/librte_eal/common/include/rte_random.h >b/lib/librte_eal/common/include/rte_random.h >index 24ae836..b9248cd 100644 >--- a/lib/librte_eal/common/include/rte_random.h >+++ b/lib/librte_eal/common/include/rte_random.h >@@ -46,6 +46,17 @@ extern "C" { >=20 > #include > #include >+#include >+#include >+ >+struct rte_rand_data { >+ struct drand48_data _dr48; >+ uint32_t _hi_bits; >+ uint8_t _bits_left; >+}; >+ >+RTE_DECLARE_PER_LCORE(struct rte_rand_data, _rand_data); >+ >=20 > /** > * Seed the pseudo-random generator. >@@ -60,7 +71,7 @@ extern "C" { > static inline void > rte_srand(uint64_t seedval) > { >- srand48((long unsigned int)seedval); >+ srand48_r((long unsigned int)seedval, &RTE_PER_LCORE(_rand_data)._dr48); > } >=20 > /** >@@ -76,10 +87,26 @@ rte_srand(uint64_t seedval) > static inline uint64_t > rte_rand(void) > { >+ struct rte_rand_data *rd =3D &RTE_PER_LCORE(_rand_data); > uint64_t val; >- val =3D lrand48(); >+ uint32_t hi_bits; >+ long int result; >+ >+ if (unlikely(rd->_bits_left < 2)) { >+ lrand48_r(&rd->_dr48, &result); >+ rd->_hi_bits |=3D (uint32_t)result << (1 - rd->_bits_left); >+ rd->_bits_left +=3D 31; >+ } >+ >+ hi_bits =3D rd->_hi_bits; >+ lrand48_r(&rd->_dr48, &result); >+ val =3D (uint32_t)result | (hi_bits & 0x8000000); > val <<=3D 32; >- val +=3D lrand48(); >+ hi_bits <<=3D 1; >+ lrand48_r(&rd->_dr48, &result); >+ val |=3D (uint32_t)result | (hi_bits & 0x8000000); >+ rd->_hi_bits =3D hi_bits << 1; >+ rd->_bits_left -=3D 2; > return val; > } >=20 >diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c >b/lib/librte_eal/linuxapp/eal/eal_thread.c >index 5635c7d..08e7f72 100644 >--- a/lib/librte_eal/linuxapp/eal/eal_thread.c >+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c >@@ -52,6 +52,8 @@ > #include > #include > #include >+#include >+#include >=20 > #include "eal_private.h" > #include "eal_thread.h" >@@ -59,6 +61,8 @@ > RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) =3D LCORE_ID_ANY; > RTE_DEFINE_PER_LCORE(unsigned, _socket_id) =3D (unsigned)SOCKET_ID_ANY; > RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset); >+RTE_DEFINE_PER_LCORE(struct rte_rand_data, _rand_data); >+ >=20 > /* > * Send a message to a slave lcore identified by slave_id to call a >@@ -147,6 +151,9 @@ eal_thread_loop(__attribute__((unused)) void *arg) > /* set the lcore ID in per-lcore memory area */ > RTE_PER_LCORE(_lcore_id) =3D lcore_id; >=20 >+ /* seed per-lcore PRNG */ >+ rte_srand(rte_rdtsc()); >+ > /* set CPU affinity */ > if (eal_thread_set_affinity() < 0) > rte_panic("cannot set affinity\n"); >--=20 >1.7.1 >