Hi Robin,

Thanks, that is a good idea.




чт, 18 июл. 2024 г. в 21:27, Morten Brørup <mb@smartsharesystems.com>:
> From: Robin Jarry [mailto:rjarry@redhat.com]
>
> Hi folks,
>
> while working on IPv6 support for grout [1], I noticed that all DPDK
> IPv6 APIs used fixed sized arrays in the route lookup functions [2].
>
>  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
>                           uint8_t ips[][RTE_FIB6_IPV6_ADDR_SIZE],
>                           uint64_t *next_hops,
>                           int n);
>
> If I'm not mistaken, using sized arrays in function signatures is only
> for documentation purposes and does not result in any specific compiler
> checks. In the above example, the ips parameter is considered as a plain
> old `uint8_t **` pointer.
>
> Also, not having a dedicated type for IPv6 addresses requires obscure
> pointer arithmetic [3] and casting [4].
>
> I'd like to introduce a real IPv6 address structure that has the same
> alignment than a dumb `uint8_t *` pointer but has an union to ease
> casting and most importantly presents the whole thing as an explicit
> typed structure:
>
>     #define RTE_IPV6_ADDR_SIZE 16
>
>     struct rte_ipv6_addr {
>         union {
>             uint8_t u8[RTE_IPV6_ADDR_SIZE];
>             uint16_t u16[RTE_IPV6_ADDR_SIZE / sizeof(uint16_t)];
>             uint32_t u32[RTE_IPV6_ADDR_SIZE / sizeof(uint32_t)];
>             uint64_t u64[RTE_IPV6_ADDR_SIZE / sizeof(uint64_t)];
>         };
>     } __rte_packed __rte_aligned(1);
>
> This would require some breakage of the APIs but I think it would
> benefit code readability and maintainability in the long term.

In short: Although I like the idea of a unified IPv6 address type very much, I'm not sure consensus can be reached about the optimal alignment of such a type.

The long version:

Please consider this proposal in a broader perspective.

The IPv4 FIB lookup takes an uint32_t array, so the IPv4 address type here is 4 byte aligned: uint32_t *ips
Generally, uint32_t or rte_be32_t is used for IPv4 addresses, and both these types are 4 byte aligned. In other words: IPv4 addresses are considered 4 byte aligned by DPDK.

I don't think it is similarly simple for IPv6 addresses.

The alignment of IPv6 addresses may depend on how/where they are used, e.g.:
1. For the FIB library, it might be good for vector implementations to have the IPv6 addresses naturally aligned (i.e. 16 byte aligned), like the uint128_t/__int128/__m128i type (or the rte_xmm_t type [XMM]). Furthermore, a simple integer type (uint128_t equivalent) might be preferable in this API.

I think alignment should be 1 since in FIB6 users usually don't copy IPv6 address and just provide a pointer to the memory inside the packet. Current vector implementation loads IPv6 addresses using unaligned access (_mm512_loadu_si512) so it doesn't rely on alignment.
 
2. In the IPv6 packet header, the IPv6 addresses are not 16 byte aligned, they are 8 byte aligned. So we cannot make the IPv6 address type 16 byte aligned.
Not necessary, if Ethernet frame in mbuf starts on 8b aligned address, then IPv6 is aligned only by 2 bytes.


I fear that broadly dumbing down the IPv6 address type to always use 1 byte alignment could potentially introduce unwanted performance penalties (now or in the future). We didn't do it for IPv4 addresses, so let's not do it for IPv6 addresses.

Perhaps we could use the lowest "non-exotic" (considering the use of IPv6 addresses) alignment, which I would guess is 8 byte - as in the IPv6 packet header.
For reference, Ethernet addresses are defined as 2 byte aligned [ETH].

[XMM]: https://elixir.bootlin.com/dpdk/v24.03/source/lib/eal/x86/include/rte_vect.h#L42
[ETH]: https://elixir.bootlin.com/dpdk/v24.07-rc2/source/lib/net/rte_ether.h#L74

>
>  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
>                           const struct rte_ipv6_addr *ips,
>                           uint64_t *next_hops,
>                           int n);
>
> I already have a semi-working draft and am in the process of splitting
> the changes into small chunks to make them easier to review.
>
> https://github.com/DPDK/dpdk/compare/main...rjarry:dpdk:ipv6-address-
> rework
>
> Is that something that would be of interest? If yes, I would like to
> announce API breakage before the release of 24.07 so that the changes
> can be integrated into 24.11.
>
> Cheers!
>
> [1] https://github.com/rjarry/grout
> [2]
> https://doc.dpdk.org/api/rte__fib6_8h.html#a924678410ccb9551cda3e75d742a
> 11e3
> [3] https://git.dpdk.org/dpdk/tree/lib/fib/trie_avx512.c?h=v24.07-
> rc2#n340
> [4] https://git.dpdk.org/dpdk/tree/lib/hash/rte_thash.h?h=v24.07-
> rc2#n156
>
> --
> Robin



--
Regards,
Vladimir