DPDK patches and discussions
* fib{,6}: questions and proposals
@ 2024-03-19  8:30 Robin Jarry
  2024-03-19 17:16 ` Medvedkin, Vladimir
  0 siblings, 1 reply; 5+ messages in thread
From: Robin Jarry @ 2024-03-19  8:30 UTC (permalink / raw)
  To: Vladimir Medvedkin; +Cc: dev, Bruce Richardson

Hi Vladimir,

I have been using rte_fib for a while and stumbled upon a few quirks. 
I was wondering if you would answer some questions:

1) Is it OK/safe to share the same fib to perform route lookups from 
   multiple lcores in parallel? So far my observations seem to validate 
   that assumption but I would like your opinion :)

2) Is it OK/safe to modify a fib from a control thread (read/write) 
   while it is used by data path threads (read only)?

3) There is no public API to list/walk all configured routes in a fib. 
   Would that be possible/easy to implement?

4) In rte_fib, every IPv4 address (route *and* next hop) needs to be in 
   host order. This is not consistent with fib6 where addresses are 
   stored in network order. It took me quite a while to figure out what 
   was wrong with my code.

   I assume this is because DIR24 needs host order integers and not 
   TRIE. Why was this not hidden in the API?

   Could we add a flag to rte_fib_conf to change the behaviour? This 
   would avoid error prone ntohl/htonl juggling.

Thanks in advance for your replies :)

-- 
Robin



* Re: fib{,6}: questions and proposals
  2024-03-19  8:30 fib{,6}: questions and proposals Robin Jarry
@ 2024-03-19 17:16 ` Medvedkin, Vladimir
  2024-03-19 20:38   ` Robin Jarry
  0 siblings, 1 reply; 5+ messages in thread
From: Medvedkin, Vladimir @ 2024-03-19 17:16 UTC (permalink / raw)
  To: Robin Jarry; +Cc: dev, Bruce Richardson

Hi Robin,

On 19/03/2024 08:30, Robin Jarry wrote:
> Hi Vladimir,
>
> I have been using rte_fib for a while and stumbled upon a few quirks. 
> I was wondering if you would answer some questions:
>
> 1) Is it OK/safe to share the same fib to perform route lookups from
>    multiple lcores in parallel? So far my observations seem to validate
>    that assumption but I would like your opinion :)
Yes, 100% :)
>
> 2) Is it OK/safe to modify a fib from a control thread (read/write)
>    while it is used by data path threads (read only)?

This part is a bit more complicated. In practice, I would say yes, 
however, there is a possibility that if the lookup thread is preempted 
in the middle of the lookup process, and at the same time the control 
thread deletes the corresponding route, then the lookup result may 
return outdated data. This problem is solved in LPM with RCU enabled. I 
have plans to implement it in the near future in the FIB.
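
For readers unfamiliar with the RCU approach mentioned above, here is a toy model of quiescent-state-based reclamation: the writer defers freeing a deleted route until every lookup thread has reported that it holds no reference into the table. This is only a sketch under stated assumptions. It does not use the DPDK RCU API, and every name in it (`qs_state`, `qs_init`, `qs_start`, `qs_report`, `qs_check`) is made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QS_MAX_READERS 8 /* arbitrary limit for this sketch */

/* Hypothetical QSBR state: one global token, one counter per reader. */
struct qs_state {
    atomic_uint_fast64_t token;                /* bumped by the writer */
    atomic_uint_fast64_t seen[QS_MAX_READERS]; /* last token each reader saw */
    unsigned int n_readers;
};

static void qs_init(struct qs_state *qs, unsigned int n_readers)
{
    assert(n_readers <= QS_MAX_READERS);
    atomic_init(&qs->token, 1);
    for (unsigned int i = 0; i < QS_MAX_READERS; i++)
        atomic_init(&qs->seen[i], 0);
    qs->n_readers = n_readers;
}

/* Writer: call after unlinking a route; returns the token to wait for. */
static uint64_t qs_start(struct qs_state *qs)
{
    return atomic_fetch_add(&qs->token, 1) + 1;
}

/* Reader: call between lookups, when no pointer into the table is held. */
static void qs_report(struct qs_state *qs, unsigned int reader)
{
    atomic_store(&qs->seen[reader], atomic_load(&qs->token));
}

/* Writer: true once every reader has been quiescent since qs_start(). */
static bool qs_check(struct qs_state *qs, uint64_t token)
{
    for (unsigned int i = 0; i < qs->n_readers; i++)
        if (atomic_load(&qs->seen[i]) < token)
            return false;
    return true;
}
```

In this model the control thread would delete the route, call `qs_start()`, and release the old entry only after `qs_check()` returns true, while each data path thread calls `qs_report()` once per burst.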

>
> 3) There is no public API to list/walk all configured routes in a fib.
>    Would that be possible/easy to implement?

Yes, it is already there. FIB under the hood uses rte_rib to hold existing 
routes. So walking through can be implemented like:

    struct rte_fib *fib;

    ....

    struct rte_rib *rib = rte_fib_get_rib(fib);
    struct rte_rib_node *cur = NULL;

    do {
        /* RTE_IPV4(0,0,0,0) is the supernet to iterate under, 0 is its depth */
        cur = rte_rib_get_nxt(rib, RTE_IPV4(0, 0, 0, 0), 0, cur,
                              RTE_RIB_GET_NXT_ALL);
        if (cur)
            printf...
    } while (cur);


>
> 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be in
>    host order. This is not consistent with fib6 where addresses are
>    stored in network order. It took me quite a while to figure out what
>    was wrong with my code.
>
>    I assume this is because DIR24 needs host order integers and not
>    TRIE. Why was this not hidden in the API?
>
>    Could we add a flag to rte_fib_conf to change the behaviour? This
>    would avoid error prone ntohl/htonl juggling.

This API behavior was created in such a way that it is the same as LPM.

As for LPM, I think it was done this way for performance reasons because 
in some scenarios you are only working with host order IPv4 addresses.

>
> Thanks in advance for your replies :)
>
-- 
Regards,
Vladimir



* Re: fib{,6}: questions and proposals
  2024-03-19 17:16 ` Medvedkin, Vladimir
@ 2024-03-19 20:38   ` Robin Jarry
  2024-03-20  7:45     ` Morten Brørup
  2024-07-25 17:22     ` Medvedkin, Vladimir
  0 siblings, 2 replies; 5+ messages in thread
From: Robin Jarry @ 2024-03-19 20:38 UTC (permalink / raw)
  To: Medvedkin, Vladimir; +Cc: dev, Bruce Richardson

Hi Vladimir,

Medvedkin, Vladimir, Mar 19, 2024 at 18:16:
> > 2) Is it OK/safe to modify a fib from a control thread (read/write) 
> >    while it is used by data path threads (read only)?
>
> This part is a bit more complicated. In practice, I would say yes, 
> however, there is a possibility that if the lookup thread is preempted 
> in the middle of the lookup process, and at the same time the control 
> thread deletes the corresponding route, then the lookup result may 
> return outdated data. This problem is solved in LPM with RCU enabled. 
> I have plans to implement it in the near future in the FIB.

OK that's good to know, thanks.

> > 3) There is no public API to list/walk all configured routes in 
> >    a fib. Would that be possible/easy to implement?
>
> Yes, it is already there. FIB under the hood uses rte_rib to hold 
> existing routes. So walking through can be implemented like:

I had tried it and got confusing results out of this. This must have 
been before I had realized that all addresses needed to be in host 
order...

I tried again and it works as advertised with a small missing detail: 
after configuring a default route, e.g.:

    rte_fib_add(fib, RTE_IPV4(2, 2, 0, 0), 16, RTE_IPV4(1, 2, 3, 4));
    rte_fib_add(fib, RTE_IPV4(3, 3, 3, 0), 24, RTE_IPV4(4, 3, 2, 1));
    rte_fib_add(fib, RTE_IPV4(0, 0, 0, 0), 0, RTE_IPV4(9, 9, 9, 9));

It is not returned by successive rte_rib_get_nxt() calls. I only see the 
other two routes:

    2.2.0.0/16 via 1.2.3.4
    3.3.3.0/24 via 4.3.2.1

Is this expected?

> > 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be 
> >    in host order. This is not consistent with fib6 where addresses 
> >    are stored in network order. It took me quite a while to figure 
> >    out what was wrong with my code. 
>
> This API behavior was created in such a way that it is the same as 
> LPM.
>
> As for LPM, I think it was done this way for performance reasons 
> because in some scenarios you are only working with host order IPv4 
> addresses.

This should really be advertised in strong capital letters in the API 
docs. Or (preferably) hidden from the user. I don't see any valid scenario 
where you would work with host order IPv4 addresses.

Do you think we could change that API or at least add a flag at FIB/RIB 
creation to make it transparent to the user and consistent between IPv4 
and IPv6?

Thanks!



* RE: fib{,6}: questions and proposals
  2024-03-19 20:38   ` Robin Jarry
@ 2024-03-20  7:45     ` Morten Brørup
  2024-07-25 17:22     ` Medvedkin, Vladimir
  1 sibling, 0 replies; 5+ messages in thread
From: Morten Brørup @ 2024-03-20  7:45 UTC (permalink / raw)
  To: Robin Jarry, Medvedkin, Vladimir; +Cc: dev, Bruce Richardson

> From: Robin Jarry [mailto:rjarry@redhat.com]
> Sent: Tuesday, 19 March 2024 21.39
> 
> Hi Vladimir,
> 
> Medvedkin, Vladimir, Mar 19, 2024 at 18:16:

[...]

> > > 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be
> > >    in host order. This is not consistent with fib6 where addresses
> > >    are stored in network order. It took me quite a while to figure
> > >    out what was wrong with my code.
> >
> > This API behavior was created in such a way that it is the same as
> > LPM.
> >
> > As for LPM, I think it was done this way for performance reasons
> > because in some scenarios you are only working with host order IPv4
> > addresses.
> 
> This should really be advertised in strong capital letters in the API
> docs. Or (preferably) hidden from the user. I don't see any valid scenario
> where you would work with host order IPv4 addresses.
> 
> Do you think we could change that API or at least add a flag at FIB/RIB
> creation to make it transparent to the user and consistent between IPv4
> and IPv6?

I agree that it's weird and inconsistent to work with IPv6 addrs in network order, and not do the same for IPv4 addrs.
We should treat IPv4 addrs like IPv6 addrs, instead of dragging around pre-IPv6 legacy host endian IPv4 addresses.
Using a mix of network order and host order for IPv4 addrs is likely to cause bugs.
I would love to see that fixed across all of DPDK, but I suppose API breakage prevents it. :-(



* Re: fib{,6}: questions and proposals
  2024-03-19 20:38   ` Robin Jarry
  2024-03-20  7:45     ` Morten Brørup
@ 2024-07-25 17:22     ` Medvedkin, Vladimir
  1 sibling, 0 replies; 5+ messages in thread
From: Medvedkin, Vladimir @ 2024-07-25 17:22 UTC (permalink / raw)
  To: Robin Jarry; +Cc: dev, Bruce Richardson


Hi Robin,

Apologies for the delayed response.

On 19/03/2024 20:38, Robin Jarry wrote:
> Hi Vladimir,
>
> Medvedkin, Vladimir, Mar 19, 2024 at 18:16:
>> > 2) Is it OK/safe to modify a fib from a control thread (read/write) 
>> >    while it is used by data path threads (read only)?
>>
>> This part is a bit more complicated. In practice, I would say yes, 
>> however, there is a possibility that if the lookup thread is 
>> preempted in the middle of the lookup process, and at the same time 
>> the control thread deletes the corresponding route, then the lookup 
>> result may return outdated data. This problem is solved in LPM with 
>> RCU enabled. I have plans to implement it in the near future in the FIB.
>
> OK that's good to know, thanks.
>
>> > 3) There is no public API to list/walk all configured routes in 
>> >    a fib. Would that be possible/easy to implement?
>>
>> Yes, it is already there. FIB under the hood uses rte_rib to hold 
>> existing routes. So walking through can be implemented like:
>
> I had tried it and got confusing results out of this. This must have 
> been before I had realized that all addresses needed to be in host 
> order...
>
> I tried again and it works as advertised with a small missing detail: 
> after configuring a default route, e.g.:
>
>    rte_fib_add(fib, RTE_IPV4(2, 2, 0, 0), 16, RTE_IPV4(1, 2, 3, 4));
>    rte_fib_add(fib, RTE_IPV4(3, 3, 3, 0), 24, RTE_IPV4(4, 3, 2, 1));
>    rte_fib_add(fib, RTE_IPV4(0, 0, 0, 0), 0, RTE_IPV4(9, 9, 9, 9));
>
> It is not returned by successive rte_rib_get_nxt() calls. I only see 
> the other two routes:
>
>    2.2.0.0/16 via 1.2.3.4
>    3.3.3.0/24 via 4.3.2.1
>
> Is this expected?

Yes, it is expected. It is also reflected in the API: "Retrieve next more 
specific prefix ...". So, in your case you should explicitly look up the 
0/0 route.

I find this more convenient for data plane structures like DIR24-8, where 
I need to find gaps for some given super prefix.
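
The "more specific" rule can be shown with a toy model. The sketch below is not the rte_rib implementation: `toy_route` and the helper functions are hypothetical, and a flat array stands in for the tree. It only shows why a walk under 0.0.0.0/0 visits 2.2.0.0/16 and 3.3.3.0/24 but not the default route itself: a prefix has to be strictly longer than the supernet to qualify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat route entry; not the rte_rib node layout. */
struct toy_route {
    uint32_t ip;   /* host byte order */
    uint8_t depth; /* prefix length, 0..32 */
};

/* Mask covering the first `depth` bits; depth 0 gives an empty mask. */
static uint32_t depth_to_mask(uint8_t depth)
{
    assert(depth <= 32);
    return depth == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - depth));
}

/* True if r lies inside the supernet and is strictly longer than it. */
static int is_more_specific(const struct toy_route *r,
                            uint32_t super_ip, uint8_t super_depth)
{
    uint32_t mask = depth_to_mask(super_depth);

    return r->depth > super_depth && (r->ip & mask) == (super_ip & mask);
}

/* Count the routes a "next more specific" walk under the supernet visits. */
static size_t count_more_specific(const struct toy_route *tbl, size_t n,
                                  uint32_t super_ip, uint8_t super_depth)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        hits += is_more_specific(&tbl[i], super_ip, super_depth);
    return hits;
}
```

With the real API, the default route would be fetched by a separate exact-match query. I believe `rte_rib_lookup_exact(rib, 0, 0)` is the call for that, but please check it against the rte_rib header.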

>
>> > 4) In rte_fib, every IPv4 address (route *and* next hop) needs to be
>> >    in host order. This is not consistent with fib6 where addresses
>> >    are stored in network order. It took me quite a while to figure
>> >    out what was wrong with my code.
>>
>> This API behavior was created in such a way that it is the same as LPM.
>>
>> As for LPM, I think it was done this way for performance reasons 
>> because in some scenarios you are only working with host order IPv4 
>> addresses.
>
> This should really be advertised in strong capital letters in the API 
> docs. Or (preferably) hidden from the user. I don't see any valid 
> scenario where you would work with host order IPv4 addresses.
I just implemented lookup the same way as LPM. As for a valid scenario, 
years ago I used an LPM/FIB lookup on a huge text log file (it was nginx 
logs, if I remember correctly) with hundreds of millions of lines with IP 
addresses to resolve corresponding AS numbers for some statistics. The 
macro I used converted substrings with IPv4 addresses into unsigned 
integers in host byte order. So, it is not always true that IPv4 
addresses are in network byte order.
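
A sketch of that kind of text-to-integer conversion follows. It is not the macro used at the time, and `parse_ipv4_host()` is a hypothetical name. The point is that the octets are assembled arithmetically, so the result is in host byte order without any byte swap, ready for a lookup that takes host-order input.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "a.b.c.d" straight into a host-order integer.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_ipv4_host(const char *text, uint32_t *out)
{
    unsigned int a, b, c, d;
    char trailing;

    assert(text != NULL && out != NULL);
    /* Exactly four fields must match; a fifth means trailing junk. */
    if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    *out = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
    return 0;
}
```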
>
> Do you think we could change that API or at least add a flag at 
> FIB/RIB creation to make it transparent to the user and consistent 
> between IPv4 and IPv6?

Yes, I will add a FIB configuration option to allow BE IPv4 addresses as 
input for the lookup function.

>
> Thanks!
>
-- 
Regards,
Vladimir

