From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 47F4743F06; Fri, 26 Apr 2024 13:17:45 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 3521743D75; Fri, 26 Apr 2024 13:17:45 +0200 (CEST) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) by mails.dpdk.org (Postfix) with ESMTP id D28EE43CEA for ; Fri, 26 Apr 2024 13:17:43 +0200 (CEST) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 314C2138A2 for ; Fri, 26 Apr 2024 13:17:43 +0200 (CEST) Received: by mail.lysator.liu.se (Postfix, from userid 1004) id 13ACD1383C; Fri, 26 Apr 2024 13:17:43 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on hermod.lysator.liu.se X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=ALL_TRUSTED,AWL, T_SCC_BODY_TEXT_LINE autolearn=disabled version=4.0.0 X-Spam-Score: -1.3 Received: from [192.168.1.59] (h-62-63-215-114.A163.priv.bahnhof.se [62.63.215.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id AEB6F137D0; Fri, 26 Apr 2024 13:17:40 +0200 (CEST) Message-ID: <923ae68e-52a4-4f0e-abb4-4940f1a796e0@lysator.liu.se> Date: Fri, 26 Apr 2024 13:17:40 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: =?UTF-8?Q?Mattias_R=C3=B6nnblom?= Subject: Re: [RFC v2 0/6] Improve EAL bit operations API To: Tyler Retzlaff , =?UTF-8?Q?Mattias_R=C3=B6nnblom?= Cc: dev@dpdk.org, Heng Wang , Stephen Hemminger References: <20240302135328.531940-2-mattias.ronnblom@ericsson.com> <20240425085853.97888-1-mattias.ronnblom@ericsson.com> <20240425180513.GA23608@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> Content-Language: en-US In-Reply-To: <20240425180513.GA23608@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV using ClamSMTP X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 2024-04-25 20:05, Tyler Retzlaff wrote: > On Thu, Apr 25, 2024 at 10:58:47AM +0200, Mattias Rönnblom wrote: >> This patch set represent an attempt to improve and extend the RTE >> bitops API, in particular for functions that operate on individual >> bits. >> >> All new functionality is exposed to the user as generic selection >> macros, delegating the actual work to private (__-marked) static >> inline functions. Public functions (e.g., rte_bit_set32()) would just >> be bloating the API. Such generic selection macros will here be >> referred to as "functions", although technically they are not. > > >> >> The legacy rte_bit_relaxed_*() family of functions is >> replaced with three families: >> >> rte_bit_[test|set|clear|assign]() which provides no memory ordering or >> atomicity guarantees and no read-once or write-once semantics (e.g., >> no use of volatile), but does provide the best performance. The >> performance degradation resulting from the use of volatile (e.g., >> forcing loads and stores to actually occur and in the number >> specified) and atomic (e.g., LOCK-prefixed instructions on x86) may be >> significant. >> >> rte_bit_once_*() which guarantees program-level load and stores >> actually occurring (i.e., prevents certain optimizations). The primary >> use of these functions are in the context of memory mapped >> I/O. Feedback on the details (semantics, naming) here would be greatly >> appreciated, since the author is not much of a driver developer. >> >> rte_bit_atomic_*() which provides atomic bit-level operations, >> including the possibility to specifying memory ordering constraints >> (or the lack thereof). >> >> The atomic functions take non-_Atomic pointers, to be flexible, just >> like the GCC builtins and default . The issue with >> _Atomic APIs is that it may well be the case that the user wants to >> perform both non-atomic and atomic operations on the same word. >> >> Having _Atomic-marked addresses would complicate supporting atomic >> bit-level operations in the bitset API (proposed in a different RFC >> patchset), and potentially other APIs depending on RTE bitops for >> atomic bit-level ops). Either one needs two bitset variants, one >> _Atomic bitset and one non-atomic one, or the bitset code needs to >> cast the non-_Atomic pointer to an _Atomic one. Having a separate >> _Atomic bitset would be bloat and also prevent the user from both, in >> some situations, doing atomic operations against a bit set, while in >> other situations (e.g., at times when MT safety is not a concern) >> operating on the same objects in a non-atomic manner. > > understood. i think the only downside is that if you do have an > _Atomic-specified type you'll have to cast the qualification away > to use the function like macro. > This is tricky, and I can't say I've really converged on an opinion, but it seems to me at this point you shouldn't mark anything _Atomic. > as a suggestion the _Generic legs could include both _Atomic-specified > and non-_Atomic-specified types where an intermediate inline function > could strip the qualification to use your core inline implementations. > > _Generic((v), int *: __foo32, RTE_ATOMIC(int) *: __foo32_unqual)(v)) > > static inline void > __foo32(int *a) { ... } > > static inline void > __foo32_unqual(RTE_ATOMIC(int) *a) { __foo32((int *)(uintptr_t)(a)); } > > there is some similar prior art in newer ISO C23 with typeof_unqual. > > https://en.cppreference.com/w/c/language/typeof > This is an interesting solution, but I'm not sure it's a problem that needs to be solved. >> >> Unlike rte_bit_relaxed_*(), individual bits are represented by bool, >> not uint32_t or uint64_t. The author found the use of such large types >> confusing, and also failed to see any performance benefits. >> >> A set of functions rte_bit_*_assign() are added, to assign a >> particular boolean value to a particular bit. >> >> All new functions have properly documented semantics. >> >> All new functions (or more correctly, generic selection macros) >> operate on both 32 and 64-bit words, with type checking. >> >> _Generic allow the user code to be a little more impact. Have a >> type-generic atomic test/set/clear/assign bit API also seems >> consistent with the "core" (word-size) atomics API, which is generic >> (both GCC builtins and are). > > ack, can you confirm _Generic is usable from a C++ TU? i may be making a > mistake locally but using g++ version 11.4.0 -std=c++20 it wasn't > accepting it. > > i think using _Generic is ideal, it eliminates mistakes when handling > the different integer sizes so if it turns out C++ doesn't want to > cooperate the function like macro can conditionally expand to a C++ > template this will need to be done for MSVC since i can confirm > _Generic does not work with MSVC C++. > That's unfortunate. No, I didn't try it with C++. I just assumed _Generic was C++ as well. The naive solution would be to include two overloaded functions per function-like macro. #ifdef __cplusplus #undef rte_bit_set static inline void rte_bit_set(uint32_t *addr, unsigned int nr) { __rte_bit_set32(addr, nr); } static inline void rte_bit_set(uint64_t *addr, unsigned int nr) { __rte_bit_set64(addr, nr); } #endif Did you have something more clever/less verbose in mind? The best would if one could have a completely generic C++-compatible replacement of _Generic, but it's not obvious how that would work. What's the minimum C++ version required by DPDK? C++11? >> >> The _Generic versions avoids having explicit unsigned long versions of >> all functions. If you have an unsigned long, it's safe to use the >> generic version (e.g., rte_set_bit()) and _Generic will pick the right >> function, provided long is either 32 or 64 bit on your platform (which >> it is on all DPDK-supported ABIs). >> >> The generic rte_bit_set() is a macro, and not a function, but >> nevertheless has been given a lower-case name. That's how C11 does it >> (for atomics, and other _Generic), and . Its address >> can't be taken, but it does not evaluate its parameters more than >> once. >> >> Things that are left out of this patch set, that may be included >> in future versions: >> >> * Have all functions returning a bit number have the same return type >> (i.e., unsigned int). >> * Harmonize naming of some GCC builtin wrappers (i.e., rte_fls_u32()). >> * Add __builtin_ffsll()/ffs() wrapper and potentially other wrappers >> for useful/used bit-level GCC builtins. >> * Eliminate the MSVC #ifdef-induced documentation duplication. >> * _Generic versions of things like rte_popcount32(). (?) > > it would be nice to see them all converted, at the time i added them we > still hadn't adopted C11 so was limited. but certainly not asking for it > as a part of this series. > >> >> Mattias Rönnblom (6): >> eal: extend bit manipulation functionality >> eal: add unit tests for bit operations >> eal: add exactly-once bit access functions >> eal: add unit tests for exactly-once bit access functions >> eal: add atomic bit operations >> eal: add unit tests for atomic bit access functions >> >> app/test/test_bitops.c | 319 +++++++++++++++++- >> lib/eal/include/rte_bitops.h | 624 ++++++++++++++++++++++++++++++++++- >> 2 files changed, 925 insertions(+), 18 deletions(-) >> >> -- > > Series-acked-by: Tyler Retzlaff > >> 2.34.1