Hi all,

[Robin] > I had not understood that it was *only* the lookups that were network order
[Morten] >When I saw the byte order flag the first time, it was not clear to me either that it only affected lookups - I too thought it covered the entire API of the library. This needs to be emphasized in the description of the flag. And the flag's name should contain LOOKUP
[Morten] > And/or rename RTE_FIB_F_NETWORK_ORDER to RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar.

There is a clear comment for this flag that it has effects on lookup. Repeating the statement with an exclamation mark seems too much. Moreover, at first this flag was named "RTE_FIB_FLAG_LOOKUP_BE" and it was suggested for renaming here:
https://inbox.dpdk.org/dev/D4SWPKOPRD5Z.87YIET3Y4AW@redhat.com/

[Morten] >Control plane API should use CPU byte order ... adding it (support for network byte order) to the RIB library would be nice too.
I'm not sure if I understood you correctly here, RIB is a control plane library.

[Robin] > an IPv4 address is *not* an integer. It should be treated as an opaque value.
I don't agree here. IPv4 is 32 bits of information. CPUs usually can treat 32 bits of information as an integer, which is really useful.

[Morten] > Treating IPv4 addresses as byte arrays would allow simple memcmp() for range comparison
How is it possible for a general case? For example, I need to test IP addresses against range 1.1.1.7 - 10.20.30.37.

[Robin] >Also for consistency with IPv6, I really think that *all* addresses should be dealt in their network form.
There is no such a problem as byte order mismatch for IPv6 addresses since they can not be treated by modern CPUs as an native integer type.

[Robin] >But it (RTE_IPV4) will always generate addresses in *host order*. Which means they cannot be used in IPv4 headers without passing them through htonl().
RTE_IPV4 is not limited by setting IPv4 headers values.

[Robin] >Maybe we could revert that patch and defer a complete change of the rib/fib APIs to only expose network order addresses?
I don't agree with that. Don't limit yourself to just manipulating network headers.

[Robin] >Thinking about it some more. Having a flag for such a drastic change in behaviour does not seem right.
This flag is optional. I don't see any problems with that.

In general, here we just have different perspectives on the problem. I can see and understand your point.
My considerations are:
- The vast majority of the longest prefix match algorithms works with addresses in host byte order (binary trees, multibit tries, DXR, except only hash based lookup)
- If you do byteswap two or more times - If you run byteswap two or more times, you are probably doing something wrong in terms of computations

So, feel free to submit patches adding this feature to the control plane API, but let's consider:
- default behaviour should remain the same. Why? At least because for my usecases I'd like to have "data representation" (byte swap) outside of the library. Not to mention ABI/API breakage
-  IPv4 should stay as uint32_t. C doesn't know such a thing as byte order, it knows about size and signedness. rte_be32_t is just a hint for us - humans :)


пт, 15 нояб. 2024 г. в 17:00, Stephen Hemminger <stephen@networkplumber.org>:
On Fri, 15 Nov 2024 15:28:33 +0100
"Robin Jarry" <rjarry@redhat.com> wrote:

> Morten Brørup, Nov 15, 2024 at 14:52:
> > Robin, you've totally won me over on this endian discussion. :-)
> > Especially the IPv6 comparison make it clear why IPv4 should also be
> > network byte order.
> >
> > API/ABI stability is a pain... we're stuck with host endian IPv4
> > addresses; e.g. for the RTE_IPV4() macro, which I now agree produces
> > the wrong endian value (on little endian CPUs). 
>
> At least for 24.11 it is too late. But maybe we could make it right for
> the next LTS?
>
> >> Vladimir, could we at least consider adding a real network order mode
> >> for the rib and fib libraries? So that we can have consistent APIs
> >> between IPv4 and IPv6? 
> >
> > And/or rename RTE_FIB_F_NETWORK_ORDER to
> > RTE_FIB_F_NETWORK_ORDER_LOOKUP or similar. This is important if real
> > network order mode is added (now or later)! 
>
> Maybe we could revert that patch and defer a complete change of the
> rib/fib APIs to only expose network order addresses? It would be an ABI
> breakage but if properly announced in advance, it should be possible.
>
> Thinking about it some more. Having a flag for such a drastic change in
> behaviour does not seem right.

It was a mistake for DPDK to define its own data structures for IP addresses.
Would have been much better to stick with the legacy what BSD, Linux (and Windows)
uses in API. 'struct in_addr' and 'struct in6_addr'

Reinvention did not help users.


--
Regards,
Vladimir