From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 996A843F09; Thu, 25 Apr 2024 20:05:16 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 16F8040A67; Thu, 25 Apr 2024 20:05:16 +0200 (CEST) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id BC50440282 for ; Thu, 25 Apr 2024 20:05:14 +0200 (CEST) Received: by linux.microsoft.com (Postfix, from userid 1086) id E9E5B210DEF5; Thu, 25 Apr 2024 11:05:13 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com E9E5B210DEF5 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1714068313; bh=2rx/VhP7w65ZDZvsxEpZGJtNihlaCIoIS+6CwbuGw00=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=VvXKmGgGbvbtVJbl2a38JUY6SufiS8sRvGrGWz3wHdhIimOcrNo4PGgm8JuG/0SZT lqmopbyXaxX7w8nGTQ7bvvcweWYb0+AOaCXKrt2uR2Fy/gJxUjAbdmoMmuMpBvQZC/ pxKYyfNevFTi3h6WazHQjSP+HqLD6nDH9ylPBN5s= Date: Thu, 25 Apr 2024 11:05:13 -0700 From: Tyler Retzlaff To: Mattias =?iso-8859-1?Q?R=F6nnblom?= Cc: dev@dpdk.org, hofors@lysator.liu.se, Heng Wang , Stephen Hemminger Subject: Re: [RFC v2 0/6] Improve EAL bit operations API Message-ID: <20240425180513.GA23608@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <20240302135328.531940-2-mattias.ronnblom@ericsson.com> <20240425085853.97888-1-mattias.ronnblom@ericsson.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20240425085853.97888-1-mattias.ronnblom@ericsson.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Thu, Apr 25, 2024 at 10:58:47AM +0200, Mattias Rönnblom wrote: > This patch set represent an attempt to improve and extend the RTE > bitops API, in particular for functions that operate on individual > bits. > > All new functionality is exposed to the user as generic selection > macros, delegating the actual work to private (__-marked) static > inline functions. Public functions (e.g., rte_bit_set32()) would just > be bloating the API. Such generic selection macros will here be > referred to as "functions", although technically they are not. > > The legacy rte_bit_relaxed_*() family of functions is > replaced with three families: > > rte_bit_[test|set|clear|assign]() which provides no memory ordering or > atomicity guarantees and no read-once or write-once semantics (e.g., > no use of volatile), but does provide the best performance. The > performance degradation resulting from the use of volatile (e.g., > forcing loads and stores to actually occur and in the number > specified) and atomic (e.g., LOCK-prefixed instructions on x86) may be > significant. > > rte_bit_once_*() which guarantees program-level load and stores > actually occurring (i.e., prevents certain optimizations). The primary > use of these functions are in the context of memory mapped > I/O. Feedback on the details (semantics, naming) here would be greatly > appreciated, since the author is not much of a driver developer. > > rte_bit_atomic_*() which provides atomic bit-level operations, > including the possibility to specifying memory ordering constraints > (or the lack thereof). > > The atomic functions take non-_Atomic pointers, to be flexible, just > like the GCC builtins and default . The issue with > _Atomic APIs is that it may well be the case that the user wants to > perform both non-atomic and atomic operations on the same word. > > Having _Atomic-marked addresses would complicate supporting atomic > bit-level operations in the bitset API (proposed in a different RFC > patchset), and potentially other APIs depending on RTE bitops for > atomic bit-level ops). Either one needs two bitset variants, one > _Atomic bitset and one non-atomic one, or the bitset code needs to > cast the non-_Atomic pointer to an _Atomic one. Having a separate > _Atomic bitset would be bloat and also prevent the user from both, in > some situations, doing atomic operations against a bit set, while in > other situations (e.g., at times when MT safety is not a concern) > operating on the same objects in a non-atomic manner. understood. i think the only downside is that if you do have an _Atomic-specified type you'll have to cast the qualification away to use the function like macro. as a suggestion the _Generic legs could include both _Atomic-specified and non-_Atomic-specified types where an intermediate inline function could strip the qualification to use your core inline implementations. _Generic((v), int *: __foo32, RTE_ATOMIC(int) *: __foo32_unqual)(v)) static inline void __foo32(int *a) { ... } static inline void __foo32_unqual(RTE_ATOMIC(int) *a) { __foo32((int *)(uintptr_t)(a)); } there is some similar prior art in newer ISO C23 with typeof_unqual. https://en.cppreference.com/w/c/language/typeof > > Unlike rte_bit_relaxed_*(), individual bits are represented by bool, > not uint32_t or uint64_t. The author found the use of such large types > confusing, and also failed to see any performance benefits. > > A set of functions rte_bit_*_assign() are added, to assign a > particular boolean value to a particular bit. > > All new functions have properly documented semantics. > > All new functions (or more correctly, generic selection macros) > operate on both 32 and 64-bit words, with type checking. > > _Generic allow the user code to be a little more impact. Have a > type-generic atomic test/set/clear/assign bit API also seems > consistent with the "core" (word-size) atomics API, which is generic > (both GCC builtins and are). ack, can you confirm _Generic is usable from a C++ TU? i may be making a mistake locally but using g++ version 11.4.0 -std=c++20 it wasn't accepting it. i think using _Generic is ideal, it eliminates mistakes when handling the different integer sizes so if it turns out C++ doesn't want to cooperate the function like macro can conditionally expand to a C++ template this will need to be done for MSVC since i can confirm _Generic does not work with MSVC C++. > > The _Generic versions avoids having explicit unsigned long versions of > all functions. If you have an unsigned long, it's safe to use the > generic version (e.g., rte_set_bit()) and _Generic will pick the right > function, provided long is either 32 or 64 bit on your platform (which > it is on all DPDK-supported ABIs). > > The generic rte_bit_set() is a macro, and not a function, but > nevertheless has been given a lower-case name. That's how C11 does it > (for atomics, and other _Generic), and . Its address > can't be taken, but it does not evaluate its parameters more than > once. > > Things that are left out of this patch set, that may be included > in future versions: > > * Have all functions returning a bit number have the same return type > (i.e., unsigned int). > * Harmonize naming of some GCC builtin wrappers (i.e., rte_fls_u32()). > * Add __builtin_ffsll()/ffs() wrapper and potentially other wrappers > for useful/used bit-level GCC builtins. > * Eliminate the MSVC #ifdef-induced documentation duplication. > * _Generic versions of things like rte_popcount32(). (?) it would be nice to see them all converted, at the time i added them we still hadn't adopted C11 so was limited. but certainly not asking for it as a part of this series. > > Mattias Rönnblom (6): > eal: extend bit manipulation functionality > eal: add unit tests for bit operations > eal: add exactly-once bit access functions > eal: add unit tests for exactly-once bit access functions > eal: add atomic bit operations > eal: add unit tests for atomic bit access functions > > app/test/test_bitops.c | 319 +++++++++++++++++- > lib/eal/include/rte_bitops.h | 624 ++++++++++++++++++++++++++++++++++- > 2 files changed, 925 insertions(+), 18 deletions(-) > > -- Series-acked-by: Tyler Retzlaff > 2.34.1