From: "Robin Jarry" <rjarry@redhat.com>
To: "Morten Brørup" <mb@smartsharesystems.com>,
"Medvedkin, Vladimir" <vladimir.medvedkin@intel.com>,
dev@dpdk.org
Subject: Re: rte_fib network order bug
Date: Thu, 14 Nov 2024 11:18:34 +0100
Message-ID: <D5LTPJ4FK5O0.1C6RBQKKUU055@redhat.com>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F8CB@smartserver.smartshare.dk>
Hi folks,
Morten Brørup, Nov 14, 2024 at 08:43:
> Medvedkin, Vladimir:
>> I think control plane API should work with prefix addresses in CPU
>> byte order. At least our RTE_IPV4 macro works this way. Also, prefix
>> is an address + prefix length (not the mask), so it is more natural
>> if address is in cpu byte order.
This may turn into a religious debate, but in my opinion, an IPv4 address
is *not* an integer. It should be treated as an opaque value. RTE_IPV4
is only useful for defining addresses in unit tests.
I do not know of any IPv4 stack implementation that deals with
*host order* addresses. Here are a couple of examples where all
addresses are stored in network order in the control plane:
https://elixir.bootlin.com/linux/v6.11.6/source/net/ipv4/fib_frontend.c#L1069
https://github.com/freebsd/freebsd-src/blob/release/14.1.0/sys/net/route/route_ctl.c#L692
https://git.fd.io/vpp/tree/src/vnet/fib/fib_table.c?h=v24.10#n237
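To make the two representations concrete, here is a minimal sketch in C. It assumes a POSIX system; HOST_ORDER_IPV4 is a hypothetical stand-in for a host-order constructor in the spirit of RTE_IPV4, and both helper functions are made up for illustration. inet_pton() stores the address in network byte order, so its in-memory bytes are the wire bytes on any CPU, while the macro result must go through htonl() before it compares equal.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a host-order constructor such as RTE_IPV4. */
#define HOST_ORDER_IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/*
 * Parse dotted-quad text. The result is in network byte order,
 * exactly as inet_pton() stores it. Illustration only: invalid
 * input trips the assertion instead of being reported.
 */
static uint32_t parse_ipv4_network_order(const char *text)
{
	struct in_addr addr;
	int rc = inet_pton(AF_INET, text, &addr);

	assert(rc == 1);
	(void)rc;
	return addr.s_addr;
}

/*
 * Return non-zero when the in-memory bytes of a network-order
 * address are a.b.c.d, independent of the CPU endianness.
 */
static int wire_bytes_equal(uint32_t net_addr, uint8_t a, uint8_t b,
			    uint8_t c, uint8_t d)
{
	const uint8_t expected[4] = { a, b, c, d };

	return memcmp(&net_addr, expected, sizeof(expected)) == 0;
}
```

With these helpers, parse_ipv4_network_order("192.0.2.1") compares equal to htonl(HOST_ORDER_IPV4(192, 0, 2, 1)) on both little-endian and big-endian machines, whereas comparing it directly to the macro result only works on big-endian ones.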
>>
>> Also, I think byte swap should be done on the interface where byte
>> order changes, and this boundary lies outside the FIB library.
>> However, I've added this feature not only because it was asked, but
>> also trying to improve performance in some cases, such as using
>> AVX512 byte swap in vector path for users who don't want to bother
>> with doing the byte swap manually on the fast path.
>>
>> Why do you think this would discourage users?
>
> Joining the discussion with a couple of comments.
>
> 1. When I saw the byte order flag the first time, it was not clear to
> me either that it only affected lookups - I too thought it covered
> the entire API of the library. This needs to be emphasized in the
> description of the flag. And the flag's name should contain LOOKUP,
> e.g.:
>
> /** If set, FIB lookup is expecting IPv4 address in network byte order. Note: Only lookup! */
> #define RTE_FIB_F_LOOKUP_NETWORK_ORDER 1
>
> 2. Control plane API should use CPU byte order. I consider inet_pton()
> irrelevant in this context. Adding network byte order lookup for
> fast path optimization makes good sense, and adding it to the RIB
> library would be nice too.
>
> If it was an address table (not longest prefix table, but a hash or
> similar), the learn()/update() function could be fast path, and
> thus support network byte order too; but it's not, so add() is
> control plane.
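The semantics described in the quoted text (control plane add() in CPU byte order, the flag affecting lookups only) can be sketched with a toy table. This is not the rte_fib implementation: every toy_* name and TOY_F_LOOKUP_NETWORK_ORDER are hypothetical, and the linear scan only stands in for a real longest-prefix-match structure.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag mirroring the name proposed in the quoted text. */
#define TOY_F_LOOKUP_NETWORK_ORDER 1
#define TOY_MAX_ROUTES 8

struct toy_route {
	uint32_t prefix;	/* always stored in CPU byte order */
	uint8_t depth;		/* prefix length in bits, 0..32 */
	uint64_t next_hop;
};

struct toy_fib {
	int flags;
	unsigned int n;
	struct toy_route routes[TOY_MAX_ROUTES];
};

/* Build a CPU-order mask covering the first `depth` bits. */
static uint32_t toy_mask(uint8_t depth)
{
	assert(depth <= 32);
	return depth == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - depth));
}

/* Control plane: the prefix is in CPU byte order whatever the flags say. */
static int toy_fib_add(struct toy_fib *fib, uint32_t prefix, uint8_t depth,
		       uint64_t next_hop)
{
	if (fib->n == TOY_MAX_ROUTES || depth > 32)
		return -1;
	fib->routes[fib->n++] = (struct toy_route){
		prefix & toy_mask(depth), depth, next_hop
	};
	return 0;
}

/* Data plane: the flag only changes how the lookup key is interpreted. */
static int toy_fib_lookup(const struct toy_fib *fib, uint32_t ip,
			  uint64_t *next_hop)
{
	int best = -1;

	if (fib->flags & TOY_F_LOOKUP_NETWORK_ORDER)
		ip = ntohl(ip);
	/* Linear longest-prefix match, for illustration only. */
	for (unsigned int i = 0; i < fib->n; i++) {
		const struct toy_route *r = &fib->routes[i];

		if ((ip & toy_mask(r->depth)) == r->prefix &&
		    (best < 0 || r->depth > fib->routes[best].depth))
			best = (int)i;
	}
	if (best < 0)
		return -1;
	*next_hop = fib->routes[best].next_hop;
	return 0;
}
```

In this sketch a caller adds 10.1.0.0/16 as 0x0A010000 but looks up an address taken straight from a packet header, which is the asymmetry being debated in this thread.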
Why would control plane use a different representation of addresses
compared to data plane? Also for consistency with IPv6, I really think
that *all* addresses should be handled in their network form.
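As a rough illustration of what "network form" can mean for both families, the sketch below treats an address as an opaque byte array and matches a prefix bit by bit, with no byte swapping. prefix_match is a hypothetical helper, not a DPDK function; the same code handles 4-byte IPv4 and 16-byte IPv6 addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 1 if the first `depth` bits of addr and prefix are equal.
 * Both are `len`-byte arrays in network form (wire byte order).
 */
static int prefix_match(const uint8_t *addr, const uint8_t *prefix,
			size_t len, unsigned int depth)
{
	size_t full = depth / 8;	/* whole bytes covered by the prefix */
	unsigned int rem = depth % 8;	/* leftover bits in the next byte */
	uint8_t mask;

	assert(depth <= len * 8);
	if (memcmp(addr, prefix, full) != 0)
		return 0;
	if (rem == 0)
		return 1;
	/* Compare only the top `rem` bits of the partially covered byte. */
	mask = (uint8_t)(0xFFu << (8 - rem));
	return ((addr[full] ^ prefix[full]) & mask) == 0;
}
```

For example, the bytes {192, 0, 2, 130} match the prefix {192, 0, 2, 128} at depth 25 when called with len 4, and an IPv6 address goes through the same function with len 16.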
Cheers.