DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Mattias Rönnblom" <hofors@lysator.liu.se>,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	"Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: <dev@dpdk.org>, "Tyler Retzlaff" <roretzla@linux.microsoft.com>
Subject: RE: [RFC v3] eal: add bitset type
Date: Fri, 2 Feb 2024 13:42:42 +0100	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9F1E3@smartserver.smartshare.dk> (raw)
In-Reply-To: <04c3bc56-0075-46fa-b49c-c6f9b844d400@lysator.liu.se>


> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Friday, 2 February 2024 11.19
> 
> On 2024-02-01 09:04, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> >> Sent: Wednesday, 31 January 2024 19.46
> >>
> >> On 2024-01-31 17:06, Stephen Hemminger wrote:
> >>> On Wed, 31 Jan 2024 14:13:01 +0100
> >>> Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:
> >
> > [...]
> >
> >>> FYI - the linux kernel has a similar but more complete set of
> >> operations.
> >>> It might be more efficient to use unsigned long rather than
> requiring
> >>> the elements to be uint64_t. Thinking of the few 32 bit platforms.
> >>>
> >>
> >> Keeping it 64-bit avoids a popcount-related #ifdef. DPDK doesn't
> have
> >> an
> >> equivalent to __builtin_popcountl().
> >>
> >> How much do we need to care about 32-bit ISA performance?
> >
> > At the 2023 DPDK Summit I talked to someone at a very well known
> network equipment vendor using 32 bit CPUs in some of their products;
> some sort of CPE, IIRC. 32 bit CPUs are still out there, and 32-bit CPU
> support has not been deprecated in DPDK.
> >
> > For the bitset parameter to functions, you could either use "unsigned
> long*" (as suggested by Stephen), or "void*" (followed by type casting
> inside the functions).
> >
> > If only using this library for the command line argument parser and
> similar, performance is irrelevant. If we foresee using it in the fast
> path, e.g. with the htimer library, we shouldn't tie its API tightly to
> 64 bit.
> >
> 
> I'm not even sure performance will be that much worse. Sure, two
> popcount instead of one. What is probably worse is older ISAs (32- or
> 64-bit, e.g. original x64_64) that lack machine instructions for
> counting set bits of *any* word size.

I'm sorry about being unclear. I didn't mean to suggest supporting *any* word size; I was thinking about one word size, either 32 or 64 bit, automatically selected at build time depending on CPU architecture.

> 
> That said, the only real concern I have about going "unsigned long" ->
> "uint64_t" is that I might feel I need to go fix <rte_bitops.h> first.

I see.
Otherwise you'll end up with a bunch of #if RTE_ARCH_32 rte_bit_<op>32() #else rte_bit_<op>64() #endif in the implementation.
Perhaps a string concatenation macro could replace that with something like rte_bit_<op>##RTE_ARCH_BITS(), or RTE_POSTFIX_ARCH_BITS(rte_bit_<op>, (params)). Just thinking out aloud.

> 
> >>
> >> I'll go through the below API and some other APIs to see if there's
> >> something obvious missing.
> >>
> >> When I originally wrote this code there were a few potential
> features
> >> where I wasn't sure to what extent they were useful. One example was
> >> the
> >> shift operation. Any input is appreciated.
> >
> > Start off with what you already have. If we need more operations,
> they can always be added later.
> >
> >>
> >>> Also, what if any thread safety guarantees? or atomic.
> >>>
> >>
> >> Currently, it's all MT unsafe.
> >>
> >> An atomic set and get/test would make sense, and maybe other
> operations
> >> would as well.
> >>
> >> Bringing in atomicity into the design makes it much less obvious:
> >>
> >> Would the atomic operations imply some memory ordering, or be
> >> "relaxed".
> >> I would lean toward relaxed, but then shouldn't bit-level atomics be
> >> consistent with the core DPDK atomics API? With that in mind, memory
> >> ordering should be user-configurable.
> >>
> >> If the code needs to be pure C11 atomics-wise, the words that makes
> up
> >> the bitset must be _Atomic uint64_t. Then you need to be careful or
> end
> >> up with "lock"-prefixed instructions if you manipulate the bitset
> >> words.
> >> Just a pure words[N] = 0; gives you a mov+mfence on x86, for
> example,
> >> plus all the fun memory_order_seq_cst in terms of preventing
> >> compiler-level optimizations. So you definitely can't have the
> bitset
> >> always using _Atomic uint64_t, since would risk non-shared use
> cases.
> >> You could have a variant I guess. Just duplicate the whole thing, or
> >> something with macros.
> >
> > It seems like MT unsafe suffices for the near term use cases.
> >
> > We can add an atomic variant of the library later, if the need should
> arise.
> >
> 
> Agreed. The only concern I have here is that you end up wanting to
> change the original design, to better be able to fit atomic bit
> operations.

In a perfect world, the design should have a roadmap leading towards atomic bit operations.
In a fast moving world, we could mark the lib experimental (and mean it!) - it is still an improvement over copy-pasting something similar all over the code.

If a potential roadmap towards atomic operations is not obvious after thinking a few moments about it, we have a clear conscience to simply deem the library unsafe for multithreading and proceed with it "as is".


  reply	other threads:[~2024-02-02 12:42 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-31 13:13 Mattias Rönnblom
2024-01-31 16:02 ` Stephen Hemminger
2024-01-31 16:28   ` Mattias Rönnblom
2024-01-31 16:06 ` Stephen Hemminger
2024-01-31 18:45   ` Mattias Rönnblom
2024-02-01  8:04     ` Morten Brørup
2024-02-02 10:19       ` Mattias Rönnblom
2024-02-02 12:42         ` Morten Brørup [this message]
2024-02-16 10:23 ` [RFC v4 1/4] " Mattias Rönnblom
2024-02-16 10:23   ` [RFC v4 2/4] eal: add bitset test suite Mattias Rönnblom
2024-02-16 10:23   ` [RFC v4 3/4] service: use multi-word bitset to represent service flags Mattias Rönnblom
2024-02-16 10:23   ` [RFC v4 4/4] event/dsw: optimize serving port logic Mattias Rönnblom
2024-05-05  7:33   ` [RFC v5 1/6] eal: add bitset type Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 2/6] eal: add bitset test suite Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 3/6] eal: add atomic bitset functions Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 4/6] eal: add unit tests for atomic bitset operations Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 5/6] service: use multi-word bitset to represent service flags Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 6/6] event/dsw: optimize serving port logic Mattias Rönnblom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35E9F1E3@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=dev@dpdk.org \
    --cc=hofors@lysator.liu.se \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=roretzla@linux.microsoft.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).