From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id EDC0E43F39; Sun, 28 Apr 2024 17:37:44 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 74E2340265; Sun, 28 Apr 2024 17:37:44 +0200 (CEST) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) by mails.dpdk.org (Postfix) with ESMTP id DED0F40263; Sun, 28 Apr 2024 17:37:43 +0200 (CEST) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 6DF561E2B8; Sun, 28 Apr 2024 17:37:43 +0200 (CEST) Received: by mail.lysator.liu.se (Postfix, from userid 1004) id 617401E279; Sun, 28 Apr 2024 17:37:43 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on hermod.lysator.liu.se X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=ALL_TRUSTED,AWL, T_SCC_BODY_TEXT_LINE autolearn=disabled version=4.0.0 X-Spam-Score: -1.3 Received: from [192.168.1.59] (h-62-63-215-114.A163.priv.bahnhof.se [62.63.215.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id AFB791E278; Sun, 28 Apr 2024 17:37:41 +0200 (CEST) Message-ID: <93464b2a-f454-4b4f-ba10-94701ee4ef72@lysator.liu.se> Date: Sun, 28 Apr 2024 17:37:41 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC v2 5/6] eal: add atomic bit operations To: =?UTF-8?Q?Morten_Br=C3=B8rup?= , =?UTF-8?Q?Mattias_R=C3=B6nnblom?= , dev@dpdk.org, Tyler Retzlaff Cc: Heng Wang , Stephen Hemminger , techboard@dpdk.org References: <20240302135328.531940-2-mattias.ronnblom@ericsson.com> <20240425085853.97888-1-mattias.ronnblom@ericsson.com> <20240425085853.97888-6-mattias.ronnblom@ericsson.com> <98CBD80474FA8B44BF855DF32C47DC35E9F3E8@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35E9F3EF@smartserver.smartshare.dk> <7e469926-0c09-42a4-aa8f-8cde0578690b@lysator.liu.se> <98CBD80474FA8B44BF855DF32C47DC35E9F3F6@smartserver.smartshare.dk> Content-Language: en-US From: =?UTF-8?Q?Mattias_R=C3=B6nnblom?= In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F3F6@smartserver.smartshare.dk> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV using ClamSMTP X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2024-04-26 14:00, Morten Brørup wrote: >> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se] >> Sent: Friday, 26 April 2024 11.39 >> >> On 2024-04-25 18:18, Morten Brørup wrote: >>>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se] >>>> Sent: Thursday, 25 April 2024 16.36 >>>> >>>> On 2024-04-25 12:25, Morten Brørup wrote: >>>>>> +#define rte_bit_atomic_test(addr, nr, memory_order) >>>> \ >>>>>> + _Generic((addr), \ >>>>>> + uint32_t *: __rte_bit_atomic_test32, \ >>>>>> + uint64_t *: __rte_bit_atomic_test64)(addr, nr, >>>> memory_order) >>>>> >>>>> I wonder if these should have RTE_ATOMIC qualifier: >>>>> >>>>> + RTE_ATOMIC(uint32_t) *: __rte_bit_atomic_test32, >>>> \ >>>>> + RTE_ATOMIC(uint64_t) *: __rte_bit_atomic_test64)(addr, nr, >>>> memory_order) >>>>> >>>>> >>>>>> +#define __RTE_GEN_BIT_ATOMIC_TEST(size) >>>> \ >>>>>> + static inline bool \ >>>>>> + __rte_bit_atomic_test ## size(const uint ## size ## _t *addr, >>>> \ >>>>> >>>>> I wonder if the "addr" parameter should have RTE_ATOMIC qualifier: >>>>> >>>>> + __rte_bit_atomic_test ## size(const RTE_ATOMIC(uint ## size ## _t) >>>> *addr, \ >>>>> >>>>> instead of casting into a_addr. >>>>> >>>> >>>> Check the cover letter for the rationale for the cast. >>> >>> Thanks, that clarifies it. Then... >>> For the series: >>> Acked-by: Morten Brørup >>> >>>> >>>> Where I'm at now is that I think C11 _Atomic is rather poor design. The >>>> assumption that an object which allows for atomic access always should >>>> require all operations upon it to be atomic, regardless of where it is >>>> in its lifetime, and which thread is accessing it, does not hold, in the >>>> general case. >>> >>> It might be slow, but I suppose the C11 standard prioritizes correctness >> over performance. >>> >> >> That's a false dichotomy, in this case. You can have both. > > In theory you shouldn't need non-atomic access to atomic variables. > In reality, we want it anyway, because real CPUs are faster at non-atomic operations. > >> >>> It seems locks are automatically added to _Atomic types larger than what is >> natively supported by the architecture. >>> E.g. MSVC adds locks to _Atomic types larger than 8 byte. [1] >>> >>> [1]: https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio- >> 2022-version-17-5-preview-2/ >>> >>>> >>>> The only reason for _Atomic being as it is, as far as I can see, is to >>>> accommodate for ISAs which does not have the appropriate atomic machine >>>> instructions, and thus require a lock or some other data associated with >>>> the actual user-data-carrying bits. Neither GCC nor DPDK supports any >>>> such ISAs, to my knowledge. I suspect neither never will. So the cast >>>> will continue to work. >>> >>> I tend to agree with you on this. >>> >>> We should officially decide that DPDK treats RTE_ATOMIC types as a union of >> _Atomic and non-atomic, i.e. operations on RTE_ATOMIC types can be both atomic >> and non-atomic. >>> >> >> I think this is a subject which needs to be further explored. > > Yes. It's easier exploring and deciding now, when our options are open, than after we have locked down the affected APIs. > >> >> Objects that can be accessed both atomically and non-atomically should >> be without _Atomic. With my current understanding of this issue, that >> seems like the best option. > > Agree. > > The alterative described below is certainly no good! > > It would be nice if they were marked as sometimes-atomic by a qualifier or special type, like rte_be32_t marks the network byte order variant of an uint32_t. > > Furthermore, large atomic objects need the _Atomic qualifier for the compiler to add (and use) the associated lock. If you have larger objects than the ISA can handle, you wouldn't want to leave the choice of the synchronization primitive to use to the compiler. I don't see how it could possibly know, which one is the most appropriate, especially in a DPDK context. It would for example need to know if the contending threads are non-preemptable or not. In some situations a sequence lock may well be your best option. Will the compiler generate one for you? If "lock" means std::mutex, it would be a disaster, performance-wise. > Alternatively, we could specify that sometimes-atomic objects cannot be larger than 8 byte, which is what MSVC can handle without locking. > >> >> You could turn it around as well, and have such marked _Atomic and have >> explicit casts to their non-_Atomic cousins when operated upon by >> non-atomic functions. Not sure how realistic that is, since >> non-atomicity is the norm. All generic selection-based "functions" must >> take this into account. >> >>>> >>>>>> + unsigned int nr, int memory_order) \ >>>>>> + { \ >>>>>> + RTE_ASSERT(nr < size); \ >>>>>> + \ >>>>>> + const RTE_ATOMIC(uint ## size ## _t) *a_addr = \ >>>>>> + (const RTE_ATOMIC(uint ## size ## _t) *)addr; \ >>>>>> + uint ## size ## _t mask = (uint ## size ## _t)1 << nr; \ >>>>>> + return rte_atomic_load_explicit(a_addr, memory_order) & >>>> mask; \ >>>>>> + } >>>>> >>>>> >>>>> Similar considerations regarding volatile qualifier for the "once" >>>> operations. >>>>>