From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E9EB54320A; Thu, 26 Oct 2023 18:36:56 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C1C5840A8A; Thu, 26 Oct 2023 18:36:56 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id A16E340A80 for ; Thu, 26 Oct 2023 18:36:54 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1086) id E213520B74C0; Thu, 26 Oct 2023 09:36:53 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com E213520B74C0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1698338213; bh=oO1k+PNmBp/vnKiaQcvYjzSslTJ9MdPpcHN5dwh3khc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=anIGcnE18ZHec26fr2KknGIHQzlZF0+xGgtvu3OIc7xrjsXdUJUAt6Mp6QYTZTiyH XD1g/MINIs/+buk/wGYBx37soXtAW5+hocR28eBfg0+/4lXrdqmaKqACovjdyxj9f9 xPTqGvL4PY42mUX/00UaCBB5TDXU1yPW4ruGaRV8= Date: Thu, 26 Oct 2023 09:36:53 -0700 From: Tyler Retzlaff To: Ruifeng Wang Cc: "dev@dpdk.org" , Akhil Goyal , Anatoly Burakov , Andrew Rybchenko , Bruce Richardson , Chenbo Xia , Ciara Power , David Christensen , David Hunt , Dmitry Kozlyuk , Dmitry Malloy , Elena Agostini , Erik Gabriel Carrillo , Fan Zhang , Ferruh Yigit , Harman Kalra , Harry van Haaren , Honnappa Nagarahalli , "jerinj@marvell.com" , Konstantin Ananyev , Matan Azrad , Maxime Coquelin , Narcisa Ana Maria Vasile , Nicolas Chautru , Olivier Matz , Ori Kam , Pallavi Kadam , Pavan Nikhilesh , Reshma Pattan , Sameh Gobriel , Shijith Thotton , Sivaprasad Tummala , Stephen Hemminger , Suanming Mou , Sunil Kumar Kori , "thomas@monjalon.net" , Viacheslav Ovsiienko , Vladimir Medvedkin , Yipeng Wang , nd Subject: Re: [PATCH v2 09/19] rcu: use rte optional stdatomic API Message-ID: 
<20231026163653.GA21677@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <1697497745-20664-1-git-send-email-roretzla@linux.microsoft.com> <1697574677-16578-1-git-send-email-roretzla@linux.microsoft.com> <1697574677-16578-10-git-send-email-roretzla@linux.microsoft.com> <20231025223814.GA30459@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Thu, Oct 26, 2023 at 04:24:54AM +0000, Ruifeng Wang wrote: > > -----Original Message----- > > From: Tyler Retzlaff > > Sent: Thursday, October 26, 2023 6:38 AM > > To: Ruifeng Wang > > Cc: dev@dpdk.org; Akhil Goyal ; Anatoly Burakov > > ; Andrew Rybchenko ; Bruce > > Richardson ; Chenbo Xia ; Ciara Power > > ; David Christensen ; David Hunt > > ; Dmitry Kozlyuk ; Dmitry Malloy > > ; Elena Agostini ; Erik Gabriel Carrillo > > ; Fan Zhang ; Ferruh Yigit > > ; Harman Kalra ; Harry van Haaren > > ; Honnappa Nagarahalli ; > > jerinj@marvell.com; Konstantin Ananyev ; Matan Azrad > > ; Maxime Coquelin ; Narcisa Ana Maria Vasile > > ; Nicolas Chautru ; Olivier Matz > > ; Ori Kam ; Pallavi Kadam > > ; Pavan Nikhilesh ; Reshma Pattan > > ; Sameh Gobriel ; Shijith Thotton > > ; Sivaprasad Tummala ; Stephen Hemminger > > ; Suanming Mou ; Sunil Kumar Kori > > ; thomas@monjalon.net; Viacheslav Ovsiienko ; > > Vladimir Medvedkin ; Yipeng Wang ; > > nd > > Subject: Re: [PATCH v2 09/19] rcu: use rte optional stdatomic API > > > > On Wed, Oct 25, 2023 at 09:41:22AM +0000, Ruifeng Wang wrote: > > > > -----Original Message----- > > > > From: Tyler Retzlaff > > > > Sent: Wednesday, October 18, 2023 4:31 AM > > > > To: dev@dpdk.org > > > > Cc: Akhil 
Goyal ; Anatoly Burakov > > > > ; Andrew Rybchenko > > > > ; Bruce Richardson > > > > ; Chenbo Xia ; > > > > Ciara Power ; David Christensen > > > > ; David Hunt ; Dmitry > > > > Kozlyuk ; Dmitry Malloy > > > > ; Elena Agostini ; Erik > > > > Gabriel Carrillo ; Fan Zhang > > > > ; Ferruh Yigit ; > > > > Harman Kalra ; Harry van Haaren > > > > ; Honnappa Nagarahalli > > > > ; jerinj@marvell.com; Konstantin > > > > Ananyev ; Matan Azrad > > > > ; Maxime Coquelin ; > > > > Narcisa Ana Maria Vasile ; Nicolas > > > > Chautru ; Olivier Matz > > > > ; Ori Kam ; Pallavi Kadam > > > > ; Pavan Nikhilesh > > > > ; Reshma Pattan ; > > > > Sameh Gobriel ; Shijith Thotton > > > > ; Sivaprasad Tummala > > > > ; Stephen Hemminger > > > > ; Suanming Mou ; > > > > Sunil Kumar Kori ; thomas@monjalon.net; > > > > Viacheslav Ovsiienko ; Vladimir Medvedkin > > > > ; Yipeng Wang > > > > ; Tyler Retzlaff > > > > > > > > Subject: [PATCH v2 09/19] rcu: use rte optional stdatomic API > > > > > > > > Replace the use of gcc builtin __atomic_xxx intrinsics with > > > > corresponding rte_atomic_xxx optional stdatomic API > > > > > > > > Signed-off-by: Tyler Retzlaff > > > > --- > > > > lib/rcu/rte_rcu_qsbr.c | 48 +++++++++++++++++------------------ > > > > lib/rcu/rte_rcu_qsbr.h | 68 > > > > +++++++++++++++++++++++++------------------------- > > > > 2 files changed, 58 insertions(+), 58 deletions(-) > > > > > > > > diff --git a/lib/rcu/rte_rcu_qsbr.c b/lib/rcu/rte_rcu_qsbr.c index > > > > 17be93e..4dc7714 100644 > > > > --- a/lib/rcu/rte_rcu_qsbr.c > > > > +++ b/lib/rcu/rte_rcu_qsbr.c > > > > @@ -102,21 +102,21 @@ > > > > * go out of sync. Hence, additional checks are required. 
> > > > */ > > > > /* Check if the thread is already registered */ > > > > - old_bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > - __ATOMIC_RELAXED); > > > > + old_bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > + rte_memory_order_relaxed); > > > > if (old_bmap & 1UL << id) > > > > return 0; > > > > > > > > do { > > > > new_bmap = old_bmap | (1UL << id); > > > > - success = __atomic_compare_exchange( > > > > + success = rte_atomic_compare_exchange_strong_explicit( > > > > __RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > - &old_bmap, &new_bmap, 0, > > > > - __ATOMIC_RELEASE, __ATOMIC_RELAXED); > > > > + &old_bmap, new_bmap, > > > > + rte_memory_order_release, rte_memory_order_relaxed); > > > > > > > > if (success) > > > > - __atomic_fetch_add(&v->num_threads, > > > > - 1, __ATOMIC_RELAXED); > > > > + rte_atomic_fetch_add_explicit(&v->num_threads, > > > > + 1, rte_memory_order_relaxed); > > > > else if (old_bmap & (1UL << id)) > > > > /* Someone else registered this thread. > > > > * Counter should not be incremented. > > > > @@ -154,8 +154,8 @@ > > > > * go out of sync. Hence, additional checks are required. > > > > */ > > > > /* Check if the thread is already unregistered */ > > > > - old_bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > - __ATOMIC_RELAXED); > > > > + old_bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > + rte_memory_order_relaxed); > > > > if (!(old_bmap & (1UL << id))) > > > > return 0; > > > > > > > > @@ -165,14 +165,14 @@ > > > > * completed before removal of the thread from the list of > > > > * reporting threads. 
> > > > */ > > > > - success = __atomic_compare_exchange( > > > > + success = rte_atomic_compare_exchange_strong_explicit( > > > > __RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > - &old_bmap, &new_bmap, 0, > > > > - __ATOMIC_RELEASE, __ATOMIC_RELAXED); > > > > + &old_bmap, new_bmap, > > > > + rte_memory_order_release, rte_memory_order_relaxed); > > > > > > > > if (success) > > > > - __atomic_fetch_sub(&v->num_threads, > > > > - 1, __ATOMIC_RELAXED); > > > > + rte_atomic_fetch_sub_explicit(&v->num_threads, > > > > + 1, rte_memory_order_relaxed); > > > > else if (!(old_bmap & (1UL << id))) > > > > /* Someone else unregistered this thread. > > > > * Counter should not be incremented. > > > > @@ -227,8 +227,8 @@ > > > > > > > > fprintf(f, " Registered thread IDs = "); > > > > for (i = 0; i < v->num_elems; i++) { > > > > - bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > - __ATOMIC_ACQUIRE); > > > > + bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > + rte_memory_order_acquire); > > > > id = i << __RTE_QSBR_THRID_INDEX_SHIFT; > > > > while (bmap) { > > > > t = __builtin_ctzl(bmap); > > > > @@ -241,26 +241,26 @@ > > > > fprintf(f, "\n"); > > > > > > > > fprintf(f, " Token = %" PRIu64 "\n", > > > > - __atomic_load_n(&v->token, __ATOMIC_ACQUIRE)); > > > > + rte_atomic_load_explicit(&v->token, rte_memory_order_acquire)); > > > > > > > > fprintf(f, " Least Acknowledged Token = %" PRIu64 "\n", > > > > - __atomic_load_n(&v->acked_token, __ATOMIC_ACQUIRE)); > > > > + rte_atomic_load_explicit(&v->acked_token, > > > > +rte_memory_order_acquire)); > > > > > > > > fprintf(f, "Quiescent State Counts for readers:\n"); > > > > for (i = 0; i < v->num_elems; i++) { > > > > - bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > - __ATOMIC_ACQUIRE); > > > > + bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i), > > > > + rte_memory_order_acquire); > > > > id = i << __RTE_QSBR_THRID_INDEX_SHIFT; > > > > while (bmap) { > > > > t = 
__builtin_ctzl(bmap); > > > > fprintf(f, "thread ID = %u, count = %" PRIu64 ", lock count = %u\n", > > > > id + t, > > > > - __atomic_load_n( > > > > + rte_atomic_load_explicit( > > > > &v->qsbr_cnt[id + t].cnt, > > > > - __ATOMIC_RELAXED), > > > > - __atomic_load_n( > > > > + rte_memory_order_relaxed), > > > > + rte_atomic_load_explicit( > > > > &v->qsbr_cnt[id + t].lock_cnt, > > > > - __ATOMIC_RELAXED)); > > > > + rte_memory_order_relaxed)); > > > > bmap &= ~(1UL << t); > > > > } > > > > } > > > > diff --git a/lib/rcu/rte_rcu_qsbr.h b/lib/rcu/rte_rcu_qsbr.h index > > > > 87e1b55..9f4aed2 100644 > > > > --- a/lib/rcu/rte_rcu_qsbr.h > > > > +++ b/lib/rcu/rte_rcu_qsbr.h > > > > @@ -63,11 +63,11 @@ > > > > * Given thread id needs to be converted to index into the array and > > > > * the id within the array element. > > > > */ > > > > -#define __RTE_QSBR_THRID_ARRAY_ELM_SIZE (sizeof(uint64_t) * 8) > > > > +#define __RTE_QSBR_THRID_ARRAY_ELM_SIZE > > > > +(sizeof(RTE_ATOMIC(uint64_t)) * > > > > +8) > > > > #define __RTE_QSBR_THRID_ARRAY_SIZE(max_threads) \ > > > > RTE_ALIGN(RTE_ALIGN_MUL_CEIL(max_threads, \ > > > > __RTE_QSBR_THRID_ARRAY_ELM_SIZE) >> 3, RTE_CACHE_LINE_SIZE) > > > > -#define __RTE_QSBR_THRID_ARRAY_ELM(v, i) ((uint64_t *) \ > > > > +#define __RTE_QSBR_THRID_ARRAY_ELM(v, i) ((uint64_t __rte_atomic *) > > > > +\ > > > > > > Is it equivalent to ((RTE_ATOMIC(uint64_t) *)? > > > > i'm not sure if you're asking about the resultant type of the expression or not? > > I see other places are using specifier hence the question. > > > > > in this context we aren't specifying an atomic type but rather adding the atomic qualifier > > to what should already be a variable that has an atomic specified type with a cast which > > is why we use __rte_atomic. > > I read from document [1] that atomic qualified type may have a different size from the original type. > If that is the case, the size difference could cause issue in bitmap array accessing. > Did I misunderstand? 
>
> [1] https://en.cppreference.com/w/c/language/atomic
>

you do not misunderstand, the standard allows atomic specified type
sizes to differ from their ordinary native type sizes. though i have
some issues with how cppreference is wording things here as compared
with the actual standard.

one of the reasons is that it allows all standard atomic functions to
be 'generic', which means they can be used on objects of arbitrary size
instead of just integer and pointer types. i.e. you can use them on
struct, union and array types.

it's implementation defined how the operations are made atomic and is
obviously target processor dependent, but in cases when the processor
has no intrinsic support to perform the operation atomically the
toolchain may generate code that uses a hidden lock. you can test
whether this is the case for arbitrary objects using the standard
specified atomic_is_lock_free.

https://en.cppreference.com/w/c/atomic/atomic_is_lock_free

so that's the long answer form of why they *may* be different size,
alignment etc. but the real question is, in this instance, will it be?

probably not.

mainly because it wouldn't make a lot of sense for clang/gcc to
suddenly decide that sizeof(uint64_t) != sizeof(_Atomic(uint64_t)) or
that they should need to use a lock on an amd64 processor to load/store
atomically (assuming native alignment) etc.

a lot of the above is why we had a lot of discussion about how and when
we could enable the use of standard C11 atomics in dpdk. as you've
probably noticed, for existing platforms, toolchains and targets it is
actually defaulted off, but it does allow binary packagers or users to
build with it on.

for compatibility, the strictest guarantees can only be made when dpdk
and the application are both built consistently to use or not use
standard atomics. it is strongly cautioned that applications should not
attempt to use an unmatched atomic operation on a dpdk atomic object.

i.e. if you enabled standard atomics, don't use __atomic_load_n
directly on a field from a public dpdk structure, instead use
rte_atomic_load_explicit and make sure your application defines
RTE_ENABLE_STDATOMIC.

hope this explanation helps.

> > > >   ((struct rte_rcu_qsbr_cnt *)(v + 1) + v->max_threads) + i)
> > > > #define __RTE_QSBR_THRID_INDEX_SHIFT 6
> > > > #define __RTE_QSBR_THRID_MASK 0x3f
> > > > @@ -75,13 +75,13 @@
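
for anyone who wants to check the size and lock-free question on their
own toolchain and target, here is a minimal sketch in plain C11. it is
not dpdk code: demo_bmap, demo_bmap_is_lock_free and demo_bmap_set are
hypothetical names made up for illustration, and it uses <stdatomic.h>
directly rather than the rte_atomic_xxx wrappers. the static_assert
assumes a target where the atomic type keeps the size of uint64_t
(expected on amd64 with gcc/clang, but not guaranteed by the standard),
and the compare-exchange loop only mirrors the shape of the bitmap
update in the quoted patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Fail the build if the atomic type changes size; indexing a bitmap
 * array through an atomic-qualified pointer would then be wrong.
 * Assumption: holds on the target being built for. */
static_assert(sizeof(_Atomic(uint64_t)) == sizeof(uint64_t),
	"_Atomic(uint64_t) differs in size from uint64_t");

/* Hypothetical stand-in for one element of a thread id bitmap. */
static _Atomic(uint64_t) demo_bmap;

/* Returns true when operations on the element need no hidden lock. */
static bool
demo_bmap_is_lock_free(void)
{
	return atomic_is_lock_free(&demo_bmap);
}

/* Set bit 'id' (0..63) with a compare-exchange loop.
 * Returns true if this call set the bit, false if it was already set. */
static bool
demo_bmap_set(unsigned int id)
{
	uint64_t old_bmap = atomic_load_explicit(&demo_bmap,
			memory_order_relaxed);
	uint64_t new_bmap;

	do {
		if (old_bmap & (UINT64_C(1) << id))
			return false; /* someone already set this bit */
		new_bmap = old_bmap | (UINT64_C(1) << id);
		/* on failure old_bmap is refreshed with the current value */
	} while (!atomic_compare_exchange_strong_explicit(&demo_bmap,
			&old_bmap, new_bmap,
			memory_order_release, memory_order_relaxed));

	return true;
}
```

if the static_assert fires or demo_bmap_is_lock_free() returns false on
some target, that is a sign the "probably not" above does not hold
there and the bitmap layout needs a closer look.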