* [dpdk-dev] LPM6 next hop size
From: Shyam Sundar Govindaraj @ 2016-09-19 20:04 UTC
To: dev

Hi,

In the IPv4 LPM implementation, the next hop size was increased from 8 bits to 24 bits. Is there a plan to increase the IPv6 LPM next hop size as well?

Also, the next hop size in this document needs to be updated, since it is no longer 1 byte:
http://dpdk.org/doc/guides/prog_guide/lpm_lib.html

Thanks
Shyam
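For context, a rough sketch of the API difference being asked about, as the lookup prototypes stood around DPDK 16.07. The signatures are paraphrased from memory rather than quoted from the headers, so treat them as an approximation:

    /* IPv4: the next hop was widened to a uint32_t (24 bits used) in 16.04. */
    int rte_lpm_lookup(struct rte_lpm *lpm, uint32_t ip, uint32_t *next_hop);

    /* IPv6: the next hop is still returned as a single byte. */
    int rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip,
                        uint8_t *next_hop);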
* Re: [dpdk-dev] LPM6 next hop size
From: Nikita Kozlov @ 2016-09-19 21:18 UTC
To: dev

On 09/19/2016 22:04, Shyam Sundar Govindaraj wrote:
> Hi,
>
> In the IPv4 LPM implementation, the next hop size was increased from 8 bits to 24 bits. Is there a plan to increase the IPv6 LPM next hop size as well?
>
> Also, the next hop size in this document needs to be updated, since it is no longer 1 byte:
> http://dpdk.org/doc/guides/prog_guide/lpm_lib.html
>
> Thanks
> Shyam

I have submitted a patch that, among other things, increases this size.
But it still needs some reviews: http://dpdk.org/dev/patchwork/patch/15295/
* Re: [dpdk-dev] LPM6 next hop size
From: Matthew Hall @ 2016-09-19 21:22 UTC
To: Nikita Kozlov; +Cc: dev

On Mon, Sep 19, 2016 at 11:18:48PM +0200, Nikita Kozlov wrote:
> I have submitted a patch that, among other things, increases this size.
> But it still needs some reviews: http://dpdk.org/dev/patchwork/patch/15295/

A whole ton of us have submitted patches to fix LPM lately.

But fixing LPM6 was deleted from the roadmap and never added back for a future release. So we don't get any upstream support to pull the patches together and get the feature into an official release.

Matthew.
* Re: [dpdk-dev] LPM6 next hop size
From: Thomas Monjalon @ 2016-09-20 20:11 UTC
To: Matthew Hall; +Cc: dev, Nikita Kozlov

2016-09-19 14:22, Matthew Hall:
> On Mon, Sep 19, 2016 at 11:18:48PM +0200, Nikita Kozlov wrote:
> > I have submitted a patch that, among other things, increases this size.
> > But it still needs some reviews: http://dpdk.org/dev/patchwork/patch/15295/
>
> A whole ton of us have submitted patches to fix LPM lately.
>
> But fixing LPM6 was deleted from the roadmap and never added back for a future release.

We do not need to have it in the roadmap to accept a patch.

> So we don't get any upstream support to pull the patches together and get the feature into an official release.

Please, will you help review this patch?
* Re: [dpdk-dev] LPM6 next hop size
From: Matthew Hall @ 2016-09-21 17:29 UTC
To: Thomas Monjalon; +Cc: dev, Nikita Kozlov

On Tue, Sep 20, 2016 at 10:11:04PM +0200, Thomas Monjalon wrote:
> Please, will you help review this patch?

Sure.

1. It adds a dependency on libbsd on Linux: bsd/sys/tree.h. Is this an expected dependency of DPDK already? I use it in my code but I am not sure it is expected for everybody else.

2. Some of the tabbing of the new code does not look right. The indentation settings probably need to be rechecked.

3. The comment formatting of the new code is a bit odd. Continued lines inside /* ... */ comments in DPDK should have a leading * on them, I think.

4. It also contains some vendor-specific comments like "/* Vyatta change to use red-black tree */". That is probably not needed if it is going into normal DPDK.

5. It uses "malloc" instead of the standard DPDK allocators. That is bad for me because I do not want to use libc malloc in my code, only the DPDK allocators and jemalloc.

6. I don't see any updates to the unit test stuff. Don't we need new tests for the changes to deletion and re-insertion of rules and the changes in the tbl8 allocation?

7. Some of us previously submitted code to expand LPM4 and LPM6 to a 24-bit next hop, but the current code is only doing 16 bits. This is not pleasant for me because I really need the 24-bit support. When can we make this happen?

Matthew.
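Regarding point 5 above, a minimal sketch (not taken from the patch) of the DPDK-native allocation path, using rte_zmalloc_socket() instead of libc malloc(). The struct lpm6_rule record and the socket_id parameter are illustrative assumptions:

    #include <stdint.h>
    #include <rte_memory.h>
    #include <rte_malloc.h>

    /* Hypothetical rule record, for illustration only. */
    struct lpm6_rule {
        uint8_t  ip[16];    /* IPv6 prefix */
        uint8_t  depth;     /* prefix length */
        uint16_t next_hop;
    };

    static struct lpm6_rule *
    rule_alloc(int socket_id)
    {
        /* Zeroed, cache-aligned allocation from the hugepage-backed heap
         * on the requested NUMA socket, instead of libc malloc(). */
        return rte_zmalloc_socket("lpm6_rule", sizeof(struct lpm6_rule),
                                  RTE_CACHE_LINE_SIZE, socket_id);
    }

The matching release call is rte_free(); the trade-off of drawing these allocations from hugepage memory is exactly what the rest of the thread debates.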
* Re: [dpdk-dev] LPM6 next hop size
From: Thomas Monjalon @ 2016-09-21 19:37 UTC
To: Matthew Hall, Nikita Kozlov; +Cc: dev

2016-09-21 10:29, Matthew Hall:
> On Tue, Sep 20, 2016 at 10:11:04PM +0200, Thomas Monjalon wrote:
> > Please, will you help review this patch?
>
> Sure.
>
> 1. It adds a dependency on libbsd on Linux: bsd/sys/tree.h. Is this an expected dependency of DPDK already? I use it in my code but I am not sure it is expected for everybody else.

No, we don't use libbsd yet.
Is there a clean way to avoid this new dependency?
What are the pros/cons?

[...]

> 7. Some of us previously submitted code to expand LPM4 and LPM6 to a 24-bit next hop, but the current code is only doing 16 bits. This is not pleasant for me because I really need the 24-bit support. When can we make this happen?

It will happen as soon as a patch is approved.
Matthew, what happened to your own patch for the 24-bit next hop?
Please do not hesitate to work together if it can accelerate landing this feature.
* Re: [dpdk-dev] LPM6 next hop size
From: Stephen Hemminger @ 2016-09-21 23:42 UTC
To: Matthew Hall; +Cc: Thomas Monjalon, dev, Nikita Kozlov

On Wed, 21 Sep 2016 10:29:05 -0700
Matthew Hall <mhall@mhcomputing.net> wrote:

> 5. It uses "malloc" instead of the standard DPDK allocators. That is bad for me because I do not want to use libc malloc in my code, only the DPDK allocators and jemalloc.

This was intentional, because rte_malloc memory comes out of the huge page area, and that is a critical resource. The code could use rte_malloc(), but that makes it more likely to break when doing Policy Based Routing (PBR) or VRF.

The BSD tree was used because it gives a space- and time-efficient lookup and is available on both BSD and Linux.
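A minimal sketch of the approach described here, assuming a hypothetical rule structure and comparator (not the patch's actual definitions): the rule set lives in a <bsd/sys/tree.h> red-black tree allocated with plain malloc(), so only the lookup tables need hugepage memory.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <bsd/sys/tree.h>   /* libbsd on Linux, native on *BSD */

    struct lpm6_rule {
        RB_ENTRY(lpm6_rule) link;   /* red-black tree linkage */
        uint8_t  ip[16];            /* IPv6 prefix */
        uint8_t  depth;             /* prefix length */
        uint16_t next_hop;
    };

    static int
    rule_cmp(struct lpm6_rule *a, struct lpm6_rule *b)
    {
        int r = memcmp(a->ip, b->ip, sizeof(a->ip));
        return r != 0 ? r : (int)a->depth - (int)b->depth;
    }

    RB_HEAD(lpm6_rules, lpm6_rule);
    RB_GENERATE_STATIC(lpm6_rules, lpm6_rule, link, rule_cmp)

    /* Rules sit in ordinary, pageable heap memory; only the tbl24/tbl8
     * lookup structures stay in hugepage memory. */
    static struct lpm6_rules rules_head = RB_INITIALIZER(&rules_head);

    static int
    rule_add(const uint8_t ip[16], uint8_t depth, uint16_t next_hop)
    {
        struct lpm6_rule *r = malloc(sizeof(*r));

        if (r == NULL)
            return -1;
        memcpy(r->ip, ip, sizeof(r->ip));
        r->depth = depth;
        r->next_hop = next_hop;
        if (RB_INSERT(lpm6_rules, &rules_head, r) != NULL) {
            free(r);        /* an identical prefix/depth already exists */
            return -1;
        }
        return 0;
    }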
* Re: [dpdk-dev] LPM6 next hop size
From: Matthew Hall @ 2016-09-22 1:50 UTC
To: Stephen Hemminger; +Cc: Thomas Monjalon, dev, Nikita Kozlov

On Wed, Sep 21, 2016 at 04:42:05PM -0700, Stephen Hemminger wrote:
> This was intentional, because rte_malloc memory comes out of the huge page area, and that is a critical resource. The code could use rte_malloc(), but that makes it more likely to break when doing Policy Based Routing (PBR) or VRF.

Can we get more clarity on why PBR or VRF would break it? The performance and fragmentation of the default glibc allocator are quite bad, so I am trying to avoid it in my app, for example.

Matthew.
* Re: [dpdk-dev] LPM6 next hop size
From: Stephen Hemminger @ 2016-09-22 5:07 UTC
To: Matthew Hall; +Cc: Thomas Monjalon, dev, Nikita Kozlov

If you have 2G of huge memory and 16M routes, then the rules start to kill the application. Since huge memory is unpageable (pinned), it is a limited resource.

On Wed, Sep 21, 2016 at 6:50 PM, Matthew Hall <mhall@mhcomputing.net> wrote:

> On Wed, Sep 21, 2016 at 04:42:05PM -0700, Stephen Hemminger wrote:
> > This was intentional, because rte_malloc memory comes out of the huge page area, and that is a critical resource. The code could use rte_malloc(), but that makes it more likely to break when doing Policy Based Routing (PBR) or VRF.
>
> Can we get more clarity on why PBR or VRF would break it? The performance and fragmentation of the default glibc allocator are quite bad, so I am trying to avoid it in my app, for example.
>
> Matthew.
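A rough back-of-envelope illustration of the concern; the ~64-byte per-rule figure is an assumption, not taken from the patch:

    /* Assume roughly 64 bytes per stored IPv6 rule (16-byte prefix, depth,
     * next hop, tree pointers, allocator overhead):
     *
     *     16M rules * 64 B  ~=  1 GiB
     *
     * Kept in rte_malloc() memory, that alone would pin about half of a
     * 2 GiB hugepage pool, crowding out the tbl24/tbl8 tables, mbuf pools
     * and rings. The same data in plain malloc() memory is pageable and
     * does not compete with those pinned resources. */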
* Re: [dpdk-dev] LPM6 next hop size
From: Matthew Hall @ 2016-09-22 17:55 UTC
To: Stephen Hemminger; +Cc: Thomas Monjalon, dev, Nikita Kozlov

On Wed, Sep 21, 2016 at 10:07:46PM -0700, Stephen Hemminger wrote:
> If you have 2G of huge memory and 16M routes, then the rules start to kill the application. Since huge memory is unpageable (pinned), it is a limited resource.

Won't paging out routes result in very poor network performance?
* Re: [dpdk-dev] LPM6 next hop size
From: Stephen Hemminger @ 2016-09-22 18:09 UTC
To: Matthew Hall; +Cc: Thomas Monjalon, dev, Nikita Kozlov

On Thu, 22 Sep 2016 10:55:43 -0700
Matthew Hall <mhall@mhcomputing.net> wrote:

> On Wed, Sep 21, 2016 at 10:07:46PM -0700, Stephen Hemminger wrote:
> > If you have 2G of huge memory and 16M routes, then the rules start to kill the application. Since huge memory is unpageable (pinned), it is a limited resource.
>
> Won't paging out routes result in very poor network performance?

This is the rules, not the LPM table itself. The rules are only used when computing the new LPM state after a route add/delete.
* Re: [dpdk-dev] LPM6 next hop size
From: Nikita Kozlov @ 2016-09-22 20:32 UTC
To: Matthew Hall, Thomas Monjalon; +Cc: dev, Stephen Hemminger

On 09/21/2016 19:29, Matthew Hall wrote:
> On Tue, Sep 20, 2016 at 10:11:04PM +0200, Thomas Monjalon wrote:
> > Please, will you help review this patch?
> Sure.

Sorry for being late to reply, I was a bit busy.

> 1. It adds a dependency on libbsd on Linux: bsd/sys/tree.h. Is this an expected dependency of DPDK already? I use it in my code but I am not sure it is expected for everybody else.

In the v1 I was importing bsd/tree.h directly into the DPDK sources, but I was told that it was better to depend on libbsd on Linux. It seemed like a good choice, and the idea was taken from a previous patch submission from Stephen Hemminger.

> 2. Some of the tabbing of the new code does not look right. The indentation settings probably need to be rechecked.

Ok, I will fix that.

> 3. The comment formatting of the new code is a bit odd. Continued lines inside /* ... */ comments in DPDK should have a leading * on them, I think.

Ok, I will fix that as well.

> 4. It also contains some vendor-specific comments like "/* Vyatta change to use red-black tree */". That is probably not needed if it is going into normal DPDK.

That is because I copy/pasted some code from another patch; I guess we can drop those comments too.

> 5. It uses "malloc" instead of the standard DPDK allocators. That is bad for me because I do not want to use libc malloc in my code, only the DPDK allocators and jemalloc.

As Stephen explained, the tree of rules is there to store the inserted rules, alongside the modified DIR-24-8 structure itself. So you should see no impact on the rte_lpm6_lookup() function from the malloc() calls, since I assume you are not calling rte_lpm6_add()/rte_lpm6_delete() in your fast path.

> 6. I don't see any updates to the unit test stuff. Don't we need new tests for the changes to deletion and re-insertion of rules and the changes in the tbl8 allocation?

I have slightly modified test11 in test_lpm6.c (http://dpdk.org/dev/patchwork/patch/15294/). I have added some calls to rte_lpm6_tbl8_count() to verify that the number of tbl8s is consistent with the previous implementation, which recreated the full DIR-24-8 on each rte_lpm6_delete() call.

During development I was also often comparing memory dumps of the tbl8 and tbl24 tables. If needed, I can resubmit the patches with a test that does a memcmp() between an LPM6 table created with only rte_lpm6_add() calls and an LPM6 table that contains the same rules but also went through some rte_lpm6_delete() calls in addition to the adds.

Do you feel that those two tests are enough, or do you have more tests to suggest?

> 7. Some of us previously submitted code to expand LPM4 and LPM6 to a 24-bit next hop, but the current code is only doing 16 bits. This is not pleasant for me because I really need the 24-bit support. When can we make this happen?

I chose to increase it to 16 bits only because I wanted to keep the rte_lpm6_tbl_entry structure at 4 bytes. The next hop field is actually 21 bits wide in the struct, but the API returns only 16 bits. I may re-submit a patch which allows using the whole 21 bits, but the 16-bit version seemed better to me, and it was enough for my needs.

About increasing the size of rte_lpm6_tbl_entry itself, I'm not sure it would come without a performance impact, and I don't really have the time right now to test it carefully enough, so any input on it is welcome.

> Matthew.

The part of the patch on which I would ideally like a good review is delete_step() and delete_expand_step(). I have the feeling that the code there could be improved in several ways concerning thread safety. You cannot expect atomicity as in LPM4, since a delete may have to modify several tbl8s, but if you do a lookup in your fast path while an rte_lpm6_delete() call is running in a different thread, the lookup should fail gracefully. In my tests that is the case, but I may have missed some corner cases.
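For reference, a sketch of the 4-byte table entry layout Nikita describes, with a 21-bit next hop field; the field names follow librte_lpm6 as I recall it and may not match the patched header exactly:

    #include <stdint.h>

    /* One 32-bit tbl24/tbl8 entry. Widening next_hop beyond 21 bits would
     * push the entry past 4 bytes and change the tables' memory footprint
     * and cache behaviour, which is the performance concern above. */
    struct lpm6_tbl_entry_sketch {
        uint32_t next_hop:    21; /* next hop, or index of the next tbl8 */
        uint32_t depth:        8; /* rule depth */
        uint32_t valid:        1; /* entry is valid */
        uint32_t valid_group:  1; /* the tbl8 group is valid */
        uint32_t ext_entry:    1; /* entry points to a tbl8, not a next hop */
    };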