From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E13C042550; Sat, 9 Sep 2023 08:45:23 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8B16B40295; Sat, 9 Sep 2023 08:45:21 +0200 (CEST) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) by mails.dpdk.org (Postfix) with ESMTP id D42CC4027F for ; Sat, 9 Sep 2023 08:45:20 +0200 (CEST) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id BF1411F89E for ; Sat, 9 Sep 2023 08:45:19 +0200 (CEST) Received: by mail.lysator.liu.se (Postfix, from userid 1004) id BD92B1F56F; Sat, 9 Sep 2023 08:45:19 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on hermod.lysator.liu.se X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=ALL_TRUSTED, AWL, NICE_REPLY_A autolearn=disabled version=3.4.6 X-Spam-Score: -2.3 Received: from [192.168.1.59] (h-62-63-215-114.A163.priv.bahnhof.se [62.63.215.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 95CA01F919; Sat, 9 Sep 2023 08:45:17 +0200 (CEST) Message-ID: <4d00a500-a054-b11b-6135-49e4ef7965f2@lysator.liu.se> Date: Sat, 9 Sep 2023 08:45:17 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.0 Subject: Re: [RFC] random: use per lcore state Content-Language: en-US To: Konstantin Ananyev , Stephen Hemminger , dev@dpdk.org Cc: =?UTF-8?Q?Mattias_R=c3=b6nnblom?= , =?UTF-8?Q?Morten_Br=c3=b8rup?= References: <20230906172013.169846-1-stephen@networkplumber.org> <741c94a4-65f9-c044-b3fc-3b8c28922d48@yandex.ru> From: =?UTF-8?Q?Mattias_R=c3=b6nnblom?= In-Reply-To: <741c94a4-65f9-c044-b3fc-3b8c28922d48@yandex.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV using ClamSMTP X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2023-09-09 02:13, Konstantin Ananyev wrote: > 06/09/2023 21:02, Mattias Rönnblom пишет: >> On 2023-09-06 19:20, Stephen Hemminger wrote: >>> Move the random number state into thread local storage. >> >> Me and Morten discussed TLS versus other alternatives in some other >> thread. The downside of TLS that Morten pointed out, from what I >> recall, is that lazy initialization is *required* (since the number of >> threads is open-ended), and the data ends up in non-huge page memory. > > Hmm.. correct me if I am wrong, but with current implementation, > rand state is also in non-huge memory: > static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1]; > Yes. The current pattern is certainly not perfect. > >> It was also unclear to me what the memory footprint implications would >> be,h would large per-lcore data structures be put in TLS. More >> specifically, if they would be duplicated across all threads, even >> non-lcore threads. >> >> None of these issues affect rte_random.c's potential usage of TLS >> (except lazy [re-]initialization makes things more complicated). >> >> Preferably, there should be one pattern that is usable across all or >> at least most DPDK modules requiring per-lcore state. >> >>> This has a several benefits. >>>   - no false cache sharing from cpu prefetching >>>   - fixes initialization of random state for non-DPDK threads >> >> This seems like a non-reason to me. That bug is easily fixed, if it >> isn't already. >> >>>   - fixes unsafe usage of random state by non-DPDK threads >>> >> >> "Makes random number generation MT safe from all threads (including >> unregistered non-EAL threads)." >> >> With current API semantics you may still register an non-EAL thread, >> to get MT safe access to this API, so I guess it's more about being >> more convenient and less error prone, than anything else. > > I understand that we never guaranteed MT safety for non-EAL threads here, Registered non-EAL threads have a lcore id and thus may safely call rte_rand(). Multiple unregistered non-EAL threads may not do so, in parallel. > but as a user of rte_rand() - it would be much more convenient, if I can > use it > from any thread wthout worring is it a EAL thread or not. Sure, especially if it comes for free. The for-free solution has yet to reveal itself though. > > About TlS usage and re-seeding - can we use some sort of middle-ground: > extend rte_rand_state with some gen-counter. > Make a 'master' copy of rte_rand_state that will be updated by rte_srand(), > and TLS copies of rte_rand_state, so rte_rand() can fist compare > its gen-counter value with master copy to decide, > does it need to copy new state from master or not. > Calling threads shouldn't all produce the same sequence. That would be silly and not very random. The generation number should be tied to the seed. > >> The new MT safety guarantees should be in the API docs as well. > > Yes, it is an extension to the current API, not a fix. > >> >>> The initialization of random number state is done by the >>> lcore (lazy initialization). >>> >>> Signed-off-by: Stephen Hemminger >>> --- >>>   lib/eal/common/rte_random.c | 38 +++++++++++++++++++------------------ >>>   1 file changed, 20 insertions(+), 18 deletions(-) >>> >>> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c >>> index 53636331a27b..9657adf6ad3b 100644 >>> --- a/lib/eal/common/rte_random.c >>> +++ b/lib/eal/common/rte_random.c >>> @@ -19,13 +19,14 @@ struct rte_rand_state { >>>       uint64_t z3; >>>       uint64_t z4; >>>       uint64_t z5; >>> -} __rte_cache_aligned; >>> +    uint64_t seed; >>> +}; >>> -/* One instance each for every lcore id-equipped thread, and one >>> - * additional instance to be shared by all others threads (i.e., all >>> - * unregistered non-EAL threads). >>> - */ >>> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1]; >>> +/* Global random seed */ >>> +static uint64_t rte_rand_seed; >>> + >>> +/* Per lcore random state. */ >>> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state); >>>   static uint32_t >>>   __rte_rand_lcg32(uint32_t *seed) >>> @@ -81,11 +82,7 @@ __rte_srand_lfsr258(uint64_t seed, struct >>> rte_rand_state *state) >>>   void >>>   rte_srand(uint64_t seed) >>>   { >>> -    unsigned int lcore_id; >>> - >>> -    /* add lcore_id to seed to avoid having the same sequence */ >>> -    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) >>> -        __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]); >>> +    __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED); >>>   } >>>   static __rte_always_inline uint64_t >>> @@ -119,15 +116,18 @@ __rte_rand_lfsr258(struct rte_rand_state *state) >>>   static __rte_always_inline >>>   struct rte_rand_state *__rte_rand_get_state(void) >>>   { >>> -    unsigned int idx; >>> +    struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state); >> >> There should really be a RTE_PER_THREAD, an alias to RTE_PER_LCORE, to >> cover this usage. Or just use __thread (or _Thread_local?). >> >>> +    uint64_t seed; >>> -    idx = rte_lcore_id(); >>> +    seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED); >>> +    if (unlikely(seed != rand_state->seed)) { >>> +        rand_state->seed = seed; >> >> Re-seeding should restart the series, on all lcores. There's nothing >> preventing the user from re-seeding the machinery repeatedly, with the >> same seed. Seems like an unusual, but still valid, use case, if you >> run repeated tests of some sort. >> >> Use a seqlock? :) I guess you need a seed generation number as well >> (e.g., is this the first time you seed with X, or the second one, etc.) >> >>> -    /* last instance reserved for unregistered non-EAL threads */ >>> -    if (unlikely(idx == LCORE_ID_ANY)) >>> -        idx = RTE_MAX_LCORE; >>> +        seed += rte_thread_self().opaque_id; >>> +        __rte_srand_lfsr258(seed, rand_state); >>> +    } >>> -    return &rand_states[idx]; >>> +    return rand_state; >>>   } >>>   uint64_t >>> @@ -227,7 +227,9 @@ RTE_INIT(rte_rand_init) >>>   { >>>       uint64_t seed; >>> -    seed = __rte_random_initial_seed(); >>> +    do >>> +        seed = __rte_random_initial_seed(); >>> +    while (seed == 0); >> >> Might be worth a comment why seed 0 is not allowed. Alternatively, use >> some other way of signaling __rte_srand_lfsr258() must be called. >> >>>       rte_srand(seed); >>>   } >