* fib{,6}: questions and proposals
@ 2024-03-19 8:30 Robin Jarry
2024-03-19 17:16 ` Medvedkin, Vladimir
0 siblings, 1 reply; 5+ messages in thread
From: Robin Jarry @ 2024-03-19 8:30 UTC (permalink / raw)
To: Vladimir Medvedkin; +Cc: dev, Bruce Richardson
Hi Vladimir,
I have been using rte_fib for a while and stumbled upon a few quirks.
I was wondering if you would answer some questions:
1) Is it OK/safe to share the same fib to perform route lookups from
multiple lcores in parallel? So far my observations seem to validate
that assumption but I would like your opinion :)
2) Is it OK/safe to modify a fib from a control thread (read/write)
while it is used by data path threads (read only)?
3) There is no public API to list/walk all configured routes in a fib.
Would that be possible/easy to implement?
4) In rte_fib, every IPv4 address (route *and* next hop) needs to be in
host order. This is not consistent with fib6 where addresses are
stored in network order. It took me quite a while to figure out what
was wrong with my code.
I assume this is because DIR24 needs host order integers and not
TRIE. Why was this not hidden in the API?
Could we add a flag to rte_fib_conf to change the behaviour? This
would avoid error prone ntohl/htonl juggling.
Thanks in advance for your replies :)
--
Robin
* Re: fib{,6}: questions and proposals
2024-03-19 8:30 fib{,6}: questions and proposals Robin Jarry
@ 2024-03-19 17:16 ` Medvedkin, Vladimir
2024-03-19 20:38 ` Robin Jarry
0 siblings, 1 reply; 5+ messages in thread
From: Medvedkin, Vladimir @ 2024-03-19 17:16 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev, Bruce Richardson
Hi Robin,
On 19/03/2024 08:30, Robin Jarry wrote:
> Hi Vladimir,
>
> I have been using rte_fib for a while and stumbled upon a few quirks.
> I was wondering if you would answer some questions:
>
> 1) Is it OK/safe to share the same fib to perform route lookups from
> multiple lcores in parallel? So far my observations seem to validate
> that assumption but I would like your opinion :)
Yes, 100% :)
>
> 2) Is it OK/safe to modify a fib from a control thread (read/write)
> while it is used by data path threads (read only)?
This part is a bit more complicated. In practice, I would say yes,
however, there is a possibility that if the lookup thread is preempted
in the middle of the lookup process, and at the same time the control
thread deletes the corresponding route, then the lookup result may
return outdated data. This problem is solved in LPM with RCU enabled. I
have plans to implement it in the near future in the FIB.
>
> 3) There is no public API to list/walk all configured routes in a fib.
> Would that be possible/easy to implement?
Yes, it is already there. FIB under the hood uses rte_rib to hold existing
routes. So walking through can be implemented like:

    struct rte_fib *fib;
    ....
    struct rte_rib *rib = rte_fib_get_rib(fib);
    struct rte_rib_node *cur = NULL;
    do {
        /* 0.0.0.0 is the supernet you'd like to iterate over, 0 is its depth */
        cur = rte_rib_get_nxt(rib, RTE_IPV4(0, 0, 0, 0), 0, cur,
                              RTE_RIB_GET_NXT_ALL);
        if (cur)
            printf...
    } while (cur);
>
> 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be
> in host order. This is not consistent with fib6 where addresses are
> stored in network order. It took me quite a while to figure out what
> was wrong with my code.
>
> I assume this is because DIR24 needs host order integers and not
> TRIE. Why was this not hidden in the API?
>
> Could we add a flag to rte_fib_conf to change the behaviour? This
> would avoid error prone ntohl/htonl juggling.
This API behavior was created in such a way that it is the same as LPM.
As for LPM, I think it was done this way for performance reasons because
in some scenarios you are only working with the host order ipv4 addresses.
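
To make the host order requirement concrete, here is a minimal sketch in plain C that does not depend on DPDK. IPV4_HOST() is a hypothetical stand-in that packs four octets the same way RTE_IPV4() does, and wire_to_lookup_key() is a made-up helper name. It only illustrates that an address read from a packet header (network order) has to be converted before it can be used as an IPv4 lookup key.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for RTE_IPV4(): packs a.b.c.d into a host order
 * integer, most significant octet first. */
#define IPV4_HOST(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical helper: take the four address bytes as they appear on the
 * wire (network order) and return the host order value that the IPv4
 * lookup expects as its key. */
static inline uint32_t
wire_to_lookup_key(const uint8_t wire[4])
{
	uint32_t addr_be;

	/* Copy the raw bytes; the result is a big-endian 32-bit value. */
	memcpy(&addr_be, wire, sizeof(addr_be));
	/* ntohl() is a no-op on big-endian hosts and a byte swap otherwise. */
	return ntohl(addr_be);
}
```

With this, the wire bytes {192, 0, 2, 1} produce the same key as IPV4_HOST(192, 0, 2, 1) on any host. Skipping the conversion on a little-endian machine silently yields a different key, which is the kind of mistake described in question 4.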
>
> Thanks in advance for your replies :)
>
--
Regards,
Vladimir
* Re: fib{,6}: questions and proposals
2024-03-19 17:16 ` Medvedkin, Vladimir
@ 2024-03-19 20:38 ` Robin Jarry
2024-03-20 7:45 ` Morten Brørup
2024-07-25 17:22 ` Medvedkin, Vladimir
0 siblings, 2 replies; 5+ messages in thread
From: Robin Jarry @ 2024-03-19 20:38 UTC (permalink / raw)
To: Medvedkin, Vladimir; +Cc: dev, Bruce Richardson
Hi Vladimir,
Medvedkin, Vladimir, Mar 19, 2024 at 18:16:
> > 2) Is it OK/safe to modify a fib from a control thread (read/write)
> > while it is used by data path threads (read only)?
>
> This part is a bit more complicated. In practice, I would say yes,
> however, there is a possibility that if the lookup thread is preempted
> in the middle of the lookup process, and at the same time the control
> thread deletes the corresponding route, then the lookup result may
> return outdated data. This problem is solved in LPM with RCU enabled.
> I have plans to implement it in the near future in the FIB.
OK that's good to know, thanks.
> > 3) There is no public API to list/walk all configured routes in
> > a fib. Would that be possible/easy to implement?
>
> Yes, it is already there. FIB under the hood uses rte_rib to hold
> existing routes. So walking through can be implemented like:
I had tried it and got confusing results out of this. This must have
been before I had realized that all addresses needed to be in host
order...
I tried again and it works as advertised with a small missing detail:
after configuring a default route, e.g.:
rte_fib_add(fib, RTE_IPV4(2, 2, 0, 0), 16, RTE_IPV4(1, 2, 3, 4));
rte_fib_add(fib, RTE_IPV4(3, 3, 3, 0), 24, RTE_IPV4(4, 3, 2, 1));
rte_fib_add(fib, RTE_IPV4(0, 0, 0, 0), 0, RTE_IPV4(9, 9, 9, 9));
It is not returned by successive rte_rib_get_nxt() calls. I only see the
other two routes:
2.2.0.0/16 via 1.2.3.4
3.3.3.0/24 via 4.3.2.1
Is this expected?
> > 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be
> > in host order. This is not consistent with fib6 where addresses
> > are stored in network order. It took me quite a while to figure
> > out what was wrong with my code.
>
> This API behavior was created in such a way that it is the same as
> LPM.
>
> As for LPM, I think it was done this way for performance reasons
> because in some scenarios you are only working with the host order ipv4
> addresses.
This should really be advertised in strong capital letters in the API
docs. Or (preferably) hidden from the user. I don't see any valid scenario
where you would work with host order IPv4 addresses.
Do you think we could change that API or at least add a flag at FIB/RIB
creation to make it transparent to the user and consistent between IPv4
and IPv6?
Thanks!
* RE: fib{,6}: questions and proposals
2024-03-19 20:38 ` Robin Jarry
@ 2024-03-20 7:45 ` Morten Brørup
2024-07-25 17:22 ` Medvedkin, Vladimir
1 sibling, 0 replies; 5+ messages in thread
From: Morten Brørup @ 2024-03-20 7:45 UTC (permalink / raw)
To: Robin Jarry, Medvedkin, Vladimir; +Cc: dev, Bruce Richardson
> From: Robin Jarry [mailto:rjarry@redhat.com]
> Sent: Tuesday, 19 March 2024 21.39
>
> Hi Vladimir,
>
> Medvedkin, Vladimir, Mar 19, 2024 at 18:16:
[...]
> > > 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be
> > > in host order. This is not consistent with fib6 where addresses
> > > are stored in network order. It took me quite a while to figure
> > > out what was wrong with my code.
> >
> > This API behavior was created in such a way that it is the same as
> > LPM.
> >
> > As for LPM, I think it was done this way for performance reasons
> > because in some scenarios you are only working with the host order ipv4
> > addresses.
>
> This should really be advertised in strong capital letters in the API
> docs. Or (preferably) hidden from the user. I don't see any valid scenario
> where you would work with host order IPv4 addresses.
>
> Do you think we could change that API or at least add a flag at FIB/RIB
> creation to make it transparent to the user and consistent between IPv4
> and IPv6?
I agree that it's weird and inconsistent to work with IPv6 addrs in network order, and not do the same for IPv4 addrs.
We should treat IPv4 addrs like IPv6 addrs, instead of dragging around pre-IPv6 legacy host endian IPv4 addresses.
Using a mix of network order and host order for IPv4 addrs is likely to cause bugs.
I would love to see that fixed across all of DPDK, but I suppose API breakage prevents it. :-(
* Re: fib{,6}: questions and proposals
2024-03-19 20:38 ` Robin Jarry
2024-03-20 7:45 ` Morten Brørup
@ 2024-07-25 17:22 ` Medvedkin, Vladimir
1 sibling, 0 replies; 5+ messages in thread
From: Medvedkin, Vladimir @ 2024-07-25 17:22 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev, Bruce Richardson
Hi Robin,
Apologies for the delayed response.
On 19/03/2024 20:38, Robin Jarry wrote:
> Hi Vladimir,
>
> Medvedkin, Vladimir, Mar 19, 2024 at 18:16:
>> > 2) Is it OK/safe to modify a fib from a control thread (read/write)
>> > while it is used by data path threads (read only)?
>>
>> This part is a bit more complicated. In practice, I would say yes,
>> however, there is a possibility that if the lookup thread is
>> preempted in the middle of the lookup process, and at the same time
>> the control thread deletes the corresponding route, then the lookup
>> result may return outdated data. This problem is solved in LPM with
>> RCU enabled. I have plans to implement it in the near future in the FIB.
>
> OK that's good to know, thanks.
>
>> > 3) There is no public API to list/walk all configured routes in
>> > a fib. Would that be possible/easy to implement?
>>
>> Yes, it is already there. FIB under the hood uses rte_rib to hold
>> existing routes. So walking through can be implemented like:
>
> I had tried it and got confusing results out of this. This must have
> been before I had realized that all addresses needed to be in host
> order...
>
> I tried again and it works as advertised with a small missing detail:
> after configuring a default route, e.g.:
>
> rte_fib_add(fib, RTE_IPV4(2, 2, 0, 0), 16, RTE_IPV4(1, 2, 3, 4));
> rte_fib_add(fib, RTE_IPV4(3, 3, 3, 0), 24, RTE_IPV4(4, 3, 2, 1));
> rte_fib_add(fib, RTE_IPV4(0, 0, 0, 0), 0, RTE_IPV4(9, 9, 9, 9));
>
> It is not returned by successive rte_rib_get_nxt() calls. I only see
> the other two routes:
>
> 2.2.0.0/16 via 1.2.3.4
> 3.3.3.0/24 via 4.3.2.1
>
> Is this expected?
Yes, it is expected. It is also reflected in the API: "Retrieve next more
specific prefix ...". So, in your case you should explicitly look up the
0/0 route.
I find this more convenient for data plane structures like DIR24-8, where
I need to find gaps for some given super prefix.
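
A minimal sketch of why the default route is skipped, assuming "more specific" means a strictly longer prefix that lies inside the given supernet. This is an illustrative model in plain C, not the rte_rib implementation; depth_to_mask() and is_more_specific() are hypothetical names, and all addresses are host order integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Network mask for a prefix length between 0 and 32 (host order). */
static inline uint32_t
depth_to_mask(uint8_t depth)
{
	/* A shift by 32 would be undefined, so depth 0 is handled apart. */
	return depth == 0 ? 0 : UINT32_MAX << (32 - depth);
}

/* Illustrative model only: true if ip/depth is covered by
 * super/super_depth and has a strictly longer prefix length. */
static inline bool
is_more_specific(uint32_t super, uint8_t super_depth,
		 uint32_t ip, uint8_t depth)
{
	uint32_t mask = depth_to_mask(super_depth);

	return depth > super_depth && (ip & mask) == (super & mask);
}
```

Under this model, walking from 0.0.0.0/0 matches 2.2.0.0/16 and 3.3.3.0/24, but 0.0.0.0/0 itself fails the strictly-longer test, which agrees with the output reported above.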
>
>> > 4) In rte_fib, every IPv4 address (route *and* next hop) needs to
>> >    be in host order. This is not consistent with fib6 where
>> >    addresses are stored in network order. It took me quite a while
>> >    to figure out what was wrong with my code.
>>
>> This API behavior was created in such a way that it is the same as LPM.
>>
>> As for LPM, I think it was done this way for performance reasons
>> because in some scenarios you are only working with the host order ipv4
>> addresses.
>
> This should really be advertised in strong capital letters in the API
> docs. Or (preferably) hidden from the user. I don't see any valid
> scenario where you would work with host order IPv4 addresses.
I just implemented lookup the same way as LPM. As for a valid scenario,
years ago I used an LPM/FIB lookup on a huge text log file (nginx logs,
if I remember correctly) with hundreds of millions of lines containing
IP addresses, to resolve the corresponding AS numbers for some
statistics. The macro I used converted substrings with IPv4 addresses
into unsigned integers in host byte order. So it is not always true
that IPv4 addresses are in network byte order.
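
For illustration, a text-to-integer conversion of that kind could look like the sketch below. It is not the macro from the original tool; parse_ipv4_host_order() is a hypothetical helper that reads a dotted quad at the start of a string and builds the host order value directly, so no byte swapping is involved at any point.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse a leading "a.b.c.d" from text into a host
 * order integer. Anything after the fourth octet is ignored, so the rest
 * of a log line may follow. Returns 0 on success, -1 on malformed input. */
static inline int
parse_ipv4_host_order(const char *text, uint32_t *ip)
{
	unsigned int a, b, c, d;

	if (sscanf(text, "%u.%u.%u.%u", &a, &b, &c, &d) != 4)
		return -1;
	if (a > 255 || b > 255 || c > 255 || d > 255)
		return -1;
	/* Most significant octet first: this is already host order. */
	*ip = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	      ((uint32_t)c << 8) | (uint32_t)d;
	return 0;
}
```

A value produced this way can be passed to a host order lookup as is, whereas a network order API would require an extra htonl() per parsed address.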
>
> Do you think we could change that API or at least add a flag at
> FIB/RIB creation to make it transparent to the user and consistent
> between IPv4 and IPv6?
Yes, I will add a FIB configuration option to allow BE (big-endian) IPv4
addresses as input for the lookup function.
>
> Thanks!
>
--
Regards,
Vladimir