From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <741c94a4-65f9-c044-b3fc-3b8c28922d48@yandex.ru>
Date: Sat, 9 Sep 2023 01:13:37 +0100
Subject: Re: [RFC] random: use per lcore state
From: Konstantin Ananyev
To: Mattias Rönnblom, Stephen Hemminger, dev@dpdk.org
Cc: Mattias Rönnblom, Morten Brørup
References: <20230906172013.169846-1-stephen@networkplumber.org>

06/09/2023 21:02, Mattias Rönnblom wrote:
> On 2023-09-06 19:20, Stephen Hemminger wrote:
>> Move the random number state into thread local storage.
>
> Morten and I discussed TLS versus other alternatives in some other
> thread. The downside of TLS that Morten pointed out, from what I
> recall, is that lazy initialization is *required* (since the number of
> threads is open-ended), and the data ends up in non-huge page memory.

Hmm.. correct me if I am wrong, but with the current implementation the
rand state is also in non-huge memory:
static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];

> It was also unclear to me what the memory footprint implications would
> be, should large per-lcore data structures be put in TLS. More
> specifically, whether they would be duplicated across all threads,
> even non-lcore threads.
>
> None of these issues affect rte_random.c's potential usage of TLS
> (except lazy [re-]initialization makes things more complicated).
>
> Preferably, there should be one pattern that is usable across all or at
> least most DPDK modules requiring per-lcore state.
>
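(As a side note, to make the "lazy initialization is required" point
concrete: a minimal sketch of such a per-thread, lazily initialized
state pattern could look like the below. The "foo" names are made up
for illustration only, this is not an existing DPDK API, and the code
is untested.)

#include <stdbool.h>
#include <stdint.h>

#include <rte_branch_prediction.h>
#include <rte_per_lcore.h>

struct foo_state {
    bool init;
    uint64_t value;
};

/* one TLS instance per thread (EAL or not), no RTE_MAX_LCORE sizing */
static RTE_DEFINE_PER_LCORE(struct foo_state, foo_state);

static inline struct foo_state *
foo_get_state(void)
{
    struct foo_state *s = &RTE_PER_LCORE(foo_state);

    /* the set of threads is open-ended, so the state cannot all be set
     * up front at init time; every access has to check for it instead */
    if (unlikely(!s->init)) {
        s->value = 0;   /* real per-thread init work would go here */
        s->init = true;
    }

    return s;
}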
>> This has several benefits:
>>   - no false cache sharing from cpu prefetching
>>   - fixes initialization of random state for non-DPDK threads
>
> This seems like a non-reason to me. That bug is easily fixed, if it
> isn't already.
>
>>   - fixes unsafe usage of random state by non-DPDK threads
>>
>
> "Makes random number generation MT safe from all threads (including
> unregistered non-EAL threads)."
>
> With current API semantics you may still register a non-EAL thread, to
> get MT safe access to this API, so I guess it's more about being more
> convenient and less error prone, than anything else.

I understand that we never guaranteed MT safety for non-EAL threads
here, but as a user of rte_rand() it would be much more convenient if I
could use it from any thread without worrying whether it is an EAL
thread or not.

About TLS usage and re-seeding - can we use some sort of middle ground:
extend rte_rand_state with a gen-counter, keep a 'master' copy of
rte_rand_state that is updated by rte_srand(), plus TLS copies of
rte_rand_state, so that rte_rand() can first compare its gen-counter
value with the master copy to decide whether it needs to copy the new
state from the master or not.

> The new MT safety guarantees should be in the API docs as well.

Yes, it is an extension to the current API, not a fix.

>
>> The initialization of random number state is done by the
>> lcore (lazy initialization).
>>
>> Signed-off-by: Stephen Hemminger
>> ---
>>   lib/eal/common/rte_random.c | 38 +++++++++++++++++++------------------
>>   1 file changed, 20 insertions(+), 18 deletions(-)
>>
>> diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
>> index 53636331a27b..9657adf6ad3b 100644
>> --- a/lib/eal/common/rte_random.c
>> +++ b/lib/eal/common/rte_random.c
>> @@ -19,13 +19,14 @@ struct rte_rand_state {
>>       uint64_t z3;
>>       uint64_t z4;
>>       uint64_t z5;
>> -} __rte_cache_aligned;
>> +    uint64_t seed;
>> +};
>> -/* One instance each for every lcore id-equipped thread, and one
>> - * additional instance to be shared by all others threads (i.e., all
>> - * unregistered non-EAL threads).
>> - */
>> -static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
>> +/* Global random seed */
>> +static uint64_t rte_rand_seed;
>> +
>> +/* Per lcore random state. */
>> +static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);
>>   static uint32_t
>>   __rte_rand_lcg32(uint32_t *seed)
>> @@ -81,11 +82,7 @@ __rte_srand_lfsr258(uint64_t seed, struct
>> rte_rand_state *state)
>>   void
>>   rte_srand(uint64_t seed)
>>   {
>> -    unsigned int lcore_id;
>> -
>> -    /* add lcore_id to seed to avoid having the same sequence */
>> -    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
>> -        __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
>> +    __atomic_store_n(&rte_rand_seed, seed, __ATOMIC_RELAXED);
>>   }
>>   static __rte_always_inline uint64_t
>> @@ -119,15 +116,18 @@ __rte_rand_lfsr258(struct rte_rand_state *state)
>>   static __rte_always_inline
>>   struct rte_rand_state *__rte_rand_get_state(void)
>>   {
>> -    unsigned int idx;
>> +    struct rte_rand_state *rand_state = &RTE_PER_LCORE(rte_rand_state);
>
> There should really be a RTE_PER_THREAD, an alias to RTE_PER_LCORE, to
> cover this usage. Or just use __thread (or _Thread_local?).
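(For reference: RTE_DEFINE_PER_LCORE() used by the patch is essentially
a thin wrapper around __thread, so, naming aside, it boils down to the
plain C11 spelling. The _tls suffix below is only there to keep the two
names apart.)

/* the patch, via the EAL per-lcore macro: */
static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rte_rand_state);

/* roughly equivalent plain C11 declaration: */
static _Thread_local struct rte_rand_state rte_rand_state_tls;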
>
>> +    uint64_t seed;
>> -    idx = rte_lcore_id();
>> +    seed = __atomic_load_n(&rte_rand_seed, __ATOMIC_RELAXED);
>> +    if (unlikely(seed != rand_state->seed)) {
>> +        rand_state->seed = seed;
>
> Re-seeding should restart the series, on all lcores. There's nothing
> preventing the user from re-seeding the machinery repeatedly, with the
> same seed. Seems like an unusual, but still valid, use case, if you run
> repeated tests of some sort.
>
> Use a seqlock? :) I guess you need a seed generation number as well
> (e.g., is this the first time you seed with X, or the second one, etc.)
>
>> -    /* last instance reserved for unregistered non-EAL threads */
>> -    if (unlikely(idx == LCORE_ID_ANY))
>> -        idx = RTE_MAX_LCORE;
>> +        seed += rte_thread_self().opaque_id;
>> +        __rte_srand_lfsr258(seed, rand_state);
>> +    }
>> -    return &rand_states[idx];
>> +    return rand_state;
>>   }
>>   uint64_t
>> @@ -227,7 +227,9 @@ RTE_INIT(rte_rand_init)
>>   {
>>       uint64_t seed;
>> -    seed = __rte_random_initial_seed();
>> +    do
>> +        seed = __rte_random_initial_seed();
>> +    while (seed == 0);
>
> Might be worth a comment why seed 0 is not allowed. Alternatively, use
> some other way of signaling __rte_srand_lfsr258() must be called.
>
>>       rte_srand(seed);
>>   }
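Just to make the gen-counter "middle ground" I mentioned above a bit
more concrete, here is a rough and untested sketch. The names are
invented, it reuses the existing __rte_srand_lfsr258() from
rte_random.c, and, as noted above, the master seed/generation pair
would still need something like a seqlock to be updated and read
consistently:

#include <stdint.h>

#include <rte_branch_prediction.h>
#include <rte_per_lcore.h>
#include <rte_thread.h>

struct rte_rand_state {
    uint64_t z1, z2, z3, z4, z5;
    uint64_t gen;   /* generation of the master seed this state came from */
};

/* master seed and its generation, written only by rte_srand() */
static uint64_t master_seed;
static uint64_t master_gen;

static RTE_DEFINE_PER_LCORE(struct rte_rand_state, rand_state);

void
rte_srand(uint64_t seed)
{
    __atomic_store_n(&master_seed, seed, __ATOMIC_RELAXED);
    /* bumping the generation forces every thread to re-derive its state,
     * even when the same seed value is passed again, and it also keeps
     * 0 as a valid seed */
    __atomic_fetch_add(&master_gen, 1, __ATOMIC_RELEASE);
}

static struct rte_rand_state *
__rte_rand_get_state(void)
{
    struct rte_rand_state *state = &RTE_PER_LCORE(rand_state);
    uint64_t gen = __atomic_load_n(&master_gen, __ATOMIC_ACQUIRE);

    if (unlikely(gen != state->gen)) {
        uint64_t seed = __atomic_load_n(&master_seed, __ATOMIC_RELAXED);

        /* mix in the thread id so every thread gets its own sequence */
        __rte_srand_lfsr258(seed + rte_thread_self().opaque_id, state);
        state->gen = gen;
    }

    return state;
}

Since rte_rand_init() always calls rte_srand() once, a fresh
thread-local state (gen == 0) would always differ from the master
generation, so the 'seed == 0' special case above would not be needed
either.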