DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Mattias Rönnblom" <hofors@lysator.liu.se>
To: "Stephen Hemminger" <stephen@networkplumber.org>,
	"Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: dev@dpdk.org, "Morten Brørup" <mb@smartsharesystems.com>,
	"Tyler Retzlaff" <roretzla@linux.microsoft.com>
Subject: Re: [RFC v3] eal: add bitset type
Date: Wed, 31 Jan 2024 19:45:34 +0100	[thread overview]
Message-ID: <23445b74-50b2-4e7a-a6d8-b844815031fb@lysator.liu.se> (raw)
In-Reply-To: <20240131080643.41a03cd8@hermes.local>

On 2024-01-31 17:06, Stephen Hemminger wrote:
> On Wed, 31 Jan 2024 14:13:01 +0100
> Mattias Rönnblom <mattias.ronnblom@ericsson.com> wrote:
> 
>> +/**
>> + * @file
>> + * RTE Bitset
>> + *
>> + * This file provides functions and macros for querying and
>> + * manipulating sets of bits kept in arrays of @c uint64_t-sized
>> + * elements.
>> + *
>> + * The bits in a bitset are numbered from 0 to @c size - 1, with the
>> + * lowest index being the least significant bit.
>> + *
>> + * The bitset array must be properly aligned.
>> + *
>> + * For optimal performance, the @c size parameter, required by
>> + * many of the API's functions, should be a compile-time constant.
>> + *
>> + * For large bitsets, the rte_bitmap.h API may be more appropriate.
>> + *
>> + * @warning
>> + * All functions modifying a bitset may overwrite any unused bits of
>> + * the last word. Such unused bits are ignored by all functions reading
>> + * bits.
>> + *
>> + */
> 
> FYI - the linux kernel has a similar but more complete set of operations.
> It might be more efficient to use unsigned long rather than requiring
> the elements to be uint64_t. Thinking of the few 32 bit platforms.
> 

Keeping it 64-bit avoids a popcount-related #ifdef. DPDK doesn't have an 
equivalent to __builtin_popcountl().

How much do we need to care about 32-bit ISA performance?

I'll go through the below API and some other APIs to see if there's 
something obvious missing.

When I originally wrote this code there were a few potential features 
where I wasn't sure to what extent they were useful. One example was the 
shift operation. Any input is appreciated.

> Also, what if any thread safety guarantees? or atomic.
> 

Currently, it's all MT unsafe.

An atomic set and get/test would make sense, and maybe other operations 
would as well.

Bringing in atomicity into the design makes it much less obvious:

Would the atomic operations imply some memory ordering, or be "relaxed". 
I would lean toward relaxed, but then shouldn't bit-level atomics be 
consistent with the core DPDK atomics API? With that in mind, memory 
ordering should be user-configurable.

If the code needs to be pure C11 atomics-wise, the words that makes up 
the bitset must be _Atomic uint64_t. Then you need to be careful or end 
up with "lock"-prefixed instructions if you manipulate the bitset words. 
Just a pure words[N] = 0; gives you a mov+mfence on x86, for example, 
plus all the fun memory_order_seq_cst in terms of preventing 
compiler-level optimizations. So you definitely can't have the bitset 
always using _Atomic uint64_t, since would risk non-shared use cases. 
You could have a variant I guess. Just duplicate the whole thing, or 
something with macros.

With GCC C11 builtins, you can both have the atomic cake and eat it, in 
that you both access the data non-atomically/normally, and in an atomic 
manner.

>  From kernel bitmap.h
> 
> /**
>   * DOC: bitmap overview
>   *
>   * The available bitmap operations and their rough meaning in the
>   * case that the bitmap is a single unsigned long are thus:
>   *
>   * The generated code is more efficient when nbits is known at
>   * compile-time and at most BITS_PER_LONG.
>   *
>   * ::
>   *
>   *  bitmap_zero(dst, nbits)                     *dst = 0UL
>   *  bitmap_fill(dst, nbits)                     *dst = ~0UL
>   *  bitmap_copy(dst, src, nbits)                *dst = *src
>   *  bitmap_and(dst, src1, src2, nbits)          *dst = *src1 & *src2
>   *  bitmap_or(dst, src1, src2, nbits)           *dst = *src1 | *src2
>   *  bitmap_xor(dst, src1, src2, nbits)          *dst = *src1 ^ *src2
>   *  bitmap_andnot(dst, src1, src2, nbits)       *dst = *src1 & ~(*src2)
>   *  bitmap_complement(dst, src, nbits)          *dst = ~(*src)
>   *  bitmap_equal(src1, src2, nbits)             Are *src1 and *src2 equal?
>   *  bitmap_intersects(src1, src2, nbits)        Do *src1 and *src2 overlap?
>   *  bitmap_subset(src1, src2, nbits)            Is *src1 a subset of *src2?
>   *  bitmap_empty(src, nbits)                    Are all bits zero in *src?
>   *  bitmap_full(src, nbits)                     Are all bits set in *src?
>   *  bitmap_weight(src, nbits)                   Hamming Weight: number set bits
>   *  bitmap_weight_and(src1, src2, nbits)        Hamming Weight of and'ed bitmap
>   *  bitmap_set(dst, pos, nbits)                 Set specified bit area
>   *  bitmap_clear(dst, pos, nbits)               Clear specified bit area
>   *  bitmap_find_next_zero_area(buf, len, pos, n, mask)  Find bit free area
>   *  bitmap_find_next_zero_area_off(buf, len, pos, n, mask, mask_off)  as above
>   *  bitmap_shift_right(dst, src, n, nbits)      *dst = *src >> n
>   *  bitmap_shift_left(dst, src, n, nbits)       *dst = *src << n
>   *  bitmap_cut(dst, src, first, n, nbits)       Cut n bits from first, copy rest
>   *  bitmap_replace(dst, old, new, mask, nbits)  *dst = (*old & ~(*mask)) | (*new & *mask)
>   *  bitmap_remap(dst, src, old, new, nbits)     *dst = map(old, new)(src)
>   *  bitmap_bitremap(oldbit, old, new, nbits)    newbit = map(old, new)(oldbit)
>   *  bitmap_onto(dst, orig, relmap, nbits)       *dst = orig relative to relmap
>   *  bitmap_fold(dst, orig, sz, nbits)           dst bits = orig bits mod sz
>   *  bitmap_parse(buf, buflen, dst, nbits)       Parse bitmap dst from kernel buf
>   *  bitmap_parse_user(ubuf, ulen, dst, nbits)   Parse bitmap dst from user buf
>   *  bitmap_parselist(buf, dst, nbits)           Parse bitmap dst from kernel buf
>   *  bitmap_parselist_user(buf, dst, nbits)      Parse bitmap dst from user buf
>   *  bitmap_find_free_region(bitmap, bits, order)  Find and allocate bit region
>   *  bitmap_release_region(bitmap, pos, order)   Free specified bit region
>   *  bitmap_allocate_region(bitmap, pos, order)  Allocate specified bit region
>   *  bitmap_from_arr32(dst, buf, nbits)          Copy nbits from u32[] buf to dst
>   *  bitmap_from_arr64(dst, buf, nbits)          Copy nbits from u64[] buf to dst
>   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
>   *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
>   *  bitmap_get_value8(map, start)               Get 8bit value from map at start
>   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
>   *
>   * Note, bitmap_zero() and bitmap_fill() operate over the region of
>   * unsigned longs, that is, bits behind bitmap till the unsigned long
>   * boundary will be zeroed or filled as well. Consider to use
>   * bitmap_clear() or bitmap_set() to make explicit zeroing or filling
>   * respectively.
>   */
> 

  reply	other threads:[~2024-01-31 18:45 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-31 13:13 Mattias Rönnblom
2024-01-31 16:02 ` Stephen Hemminger
2024-01-31 16:28   ` Mattias Rönnblom
2024-01-31 16:06 ` Stephen Hemminger
2024-01-31 18:45   ` Mattias Rönnblom [this message]
2024-02-01  8:04     ` Morten Brørup
2024-02-02 10:19       ` Mattias Rönnblom
2024-02-02 12:42         ` Morten Brørup
2024-02-16 10:23 ` [RFC v4 1/4] " Mattias Rönnblom
2024-02-16 10:23   ` [RFC v4 2/4] eal: add bitset test suite Mattias Rönnblom
2024-02-16 10:23   ` [RFC v4 3/4] service: use multi-word bitset to represent service flags Mattias Rönnblom
2024-02-16 10:23   ` [RFC v4 4/4] event/dsw: optimize serving port logic Mattias Rönnblom
2024-05-05  7:33   ` [RFC v5 1/6] eal: add bitset type Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 2/6] eal: add bitset test suite Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 3/6] eal: add atomic bitset functions Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 4/6] eal: add unit tests for atomic bitset operations Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 5/6] service: use multi-word bitset to represent service flags Mattias Rönnblom
2024-05-05  7:33     ` [RFC v5 6/6] event/dsw: optimize serving port logic Mattias Rönnblom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=23445b74-50b2-4e7a-a6d8-b844815031fb@lysator.liu.se \
    --to=hofors@lysator.liu.se \
    --cc=dev@dpdk.org \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=mb@smartsharesystems.com \
    --cc=roretzla@linux.microsoft.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).