* Question about the sndbuf of the tap interface with vhost-net
From: Harold Huang @ 2022-02-23 13:13 UTC
To: users; +Cc: jasowang, Maxime Coquelin, Chenbo Xia

I see that in the DPDK virtio-user driver, TUNSETSNDBUF is initialized
with INT_MAX, see:
https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169
That is OK because the tap driver uses it to support TX batching, see
this patch:
https://github.com/torvalds/linux/commit/0a0be13b8fe2cac11da2063fb03f0f39359b3069

But in tun_xdp_one, NAPI is not supported, and I want to use NAPI in
tun_get_user to enable GRO. As a result, I changed the sndbuf to a
value such as 212992, the value in /proc/sys/net/core/wmem_default. But
the performance tested by iperf is greatly degraded, from 4.5 Gbps to
750Gbps per flow. I see that the iperf server consumes 100% of a CPU
core, which should be the bottleneck of this test. The perf top result
for the iperf server's CPU core is as follows:

'''
Samples: 72 of event 'cycles', 4000 Hz, Event count (approx.): 22685278 lost: 0/0 drop: 0/0
Overhead  Shared O  Symbol
  59.86%  [kernel]  [k] report_bug
  20.66%  [kernel]  [k] module_find_bug
   6.51%  [kernel]  [k] common_interrupt
   2.82%  [kernel]  [k] __slab_free
   1.48%  [kernel]  [k] copy_user_enhanced_fast_string
   1.44%  [kernel]  [k] __skb_datagram_iter
   1.42%  [kernel]  [k] notifier_call_chain
   1.41%  [kernel]  [k] irq_work_run_list
   1.41%  [kernel]  [k] update_irq_load_avg
   1.41%  [kernel]  [k] task_tick_fair
   1.41%  [kernel]  [k] cmp_ex_search
   0.16%  [kernel]  [k] __ghes_peek_estatus.isra.12
   0.02%  [kernel]  [k] acpi_os_read_memory
   0.00%  [kernel]  [k] native_apic_mem_write
'''

I am not clear about the test result. Can we change the sndbuf size in
DPDK? Is there any way to enable vhost_net to use NAPI without changing
the tun kernel driver?
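For readers who want to experiment with the tap send buffer outside DPDK, here is a minimal sketch of applying TUNSETSNDBUF to a tap file descriptor. It is not the DPDK code linked above. The helper names (`open_tap`, `set_sndbuf`, `get_sndbuf`) are hypothetical, and the ioctl numbers are rebuilt from the `_IOW`/`_IOR` definitions in `<linux/if_tun.h>` assuming the common Linux ioctl encoding (for example x86-64 and arm64); some other architectures encode the direction bits differently. Opening the device requires CAP_NET_ADMIN.

```python
import fcntl
import os
import struct

# Direction bits used by the generic Linux _IOC() encoding (assumption:
# an architecture such as x86-64 or arm64 that uses the generic layout).
_IOC_WRITE = 1
_IOC_READ = 2


def _ioc(direction, type_char, nr, size):
    """Rebuild an ioctl request number the way the _IOC() macro does."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


INT_SIZE = struct.calcsize("i")
TUNSETIFF = _ioc(_IOC_WRITE, "T", 202, INT_SIZE)     # _IOW('T', 202, int)
TUNGETSNDBUF = _ioc(_IOC_READ, "T", 211, INT_SIZE)   # _IOR('T', 211, int)
TUNSETSNDBUF = _ioc(_IOC_WRITE, "T", 212, INT_SIZE)  # _IOW('T', 212, int)

IFF_TAP = 0x0002
IFF_NO_PI = 0x1000
INT_MAX = 2**31 - 1  # the default tap sndbuf mentioned in this thread


def open_tap(name):
    """Open /dev/net/tun and attach the fd to the tap device `name`."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    # struct ifreq: 16-byte interface name, 16-bit flags, padding to 40 bytes.
    ifreq = struct.pack("16sH22x", name.encode(), IFF_TAP | IFF_NO_PI)
    fcntl.ioctl(fd, TUNSETIFF, ifreq)
    return fd


def set_sndbuf(fd, size):
    """Set the tap socket send buffer; the kernel reads a C int."""
    fcntl.ioctl(fd, TUNSETSNDBUF, struct.pack("i", size))


def get_sndbuf(fd):
    """Read back the current tap socket send buffer."""
    raw = fcntl.ioctl(fd, TUNGETSNDBUF, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]
```

With sufficient privileges, `fd = open_tap("tap0")` followed by `set_sndbuf(fd, 212992)` would apply the non-INT_MAX value discussed in this message, and `get_sndbuf(fd)` reads it back. The device name `tap0` is only an example.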
* Re: Question about the sndbuf of the tap interface with vhost-net
From: Harold Huang @ 2022-02-23 13:46 UTC
To: users; +Cc: jasowang, Maxime Coquelin, Chenbo Xia

Sorry, a correction: the performance tested by iperf is degraded from
4.5 Gbps to 750 Mbps per flow.

Harold Huang <baymaxhuang@gmail.com> wrote on Wed, Feb 23, 2022 at 21:13:
>
> [...]
> As a result, I changed the sndbuf to a
> value such as 212992 in /proc/sys/net/core/wmem_default. But the
> performance tested by iperf is greatly degraded, from 4.5 Gbps to
> 750Gbps per flow.
> [...]
* Re: Question about the sndbuf of the tap interface with vhost-net
From: Jason Wang @ 2022-02-24 3:23 UTC
To: Harold Huang; +Cc: users, Maxime Coquelin, Chenbo Xia, netdev

Adding netdev.

On Wed, Feb 23, 2022 at 9:46 PM Harold Huang <baymaxhuang@gmail.com> wrote:
>
> Sorry. The performance tested by iperf is degraded from 4.5 Gbps to
> 750Mbps per flow.
>
> Harold Huang <baymaxhuang@gmail.com> wrote on Wed, Feb 23, 2022 at 21:13:
> >
> > I see in dpdk virtio-user driver, the TUNSETSNDBUF is initialized with
> > INT_MAX, see: https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169

Note that Linux uses INT_MAX as the default sndbuf for tuntap.

> > It is ok because tap driver uses it to support tx batching, see this
> > patch: https://github.com/torvalds/linux/commit/0a0be13b8fe2cac11da2063fb03f0f39359b3069
> >
> > But in tun_xdp_one, napi is not supported and I want to use napi in
> > tun_get_user to enable gro.

NAPI is not enabled in this path. Do you want to send a patch to do that?

Btw, NAPI mode was initially used for hardening the kernel networking
stack, but it would be interesting to see whether it helps performance.

> > As a result, I change the sndbuf to a
> > value such as 212992 in /proc/sys/net/core/wmem_default.

Can you describe your setup in detail? Where did you run the iperf
server and client, and where did you change wmem_default?

> > [...]
> > I am not clear about the test result. Can we change the sndbuf size in
> > dpdk? Is there any way to enable vhost_net to use napi without changing
> > the tun kernel driver?

You can do this by not using INT_MAX as sndbuf.

Thanks
* Re: Question about the sndbuf of the tap interface with vhost-net
From: Harold Huang @ 2022-02-24 4:19 UTC
To: Jason Wang; +Cc: users, Maxime Coquelin, Chenbo Xia, netdev

Thanks for your comments, Jason.

Jason Wang <jasowang@redhat.com> wrote on Thu, Feb 24, 2022 at 11:23:
> [...]
> > > But in tun_xdp_one, napi is not supported and I want to use napi in
> > > tun_get_user to enable gro.
>
> NAPI is not enabled in this path. Do you want to send a patch to do that?

Yes, I have a patch for this path that enables NAPI, and it greatly
improves TCP stream performance, from 4.5 Gbps to 9.2 Gbps per flow. I
will send it later for comments.

> Btw, NAPI mode was initially used for hardening the kernel networking
> stack, but it would be interesting to see whether it helps performance.
>
> > > As a result, I change the sndbuf to a
> > > value such as 212992 in /proc/sys/net/core/wmem_default.
>
> Can you describe your setup in detail? Where did you run the iperf
> server and client, and where did you change wmem_default?

I use dpdk-testpmd to test the vhost-net performance, for example:

    dpdk-testpmd -l 0-9 -n 4 \
      --vdev=virtio_user0,path=/dev/vhost-net,queue_size=1024,mac=00:00:0a:00:00:02 \
      -a 0000:06:00.1 -- -i --txd=1024 --rxd=1024

I have changed the sndbuf in
https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169
from INT_MAX to 212992. I have also enabled NAPI in the tun module. The
iperf server runs on the tap interface on the kernel side and receives
the TCP stream from dpdk-testpmd. But the performance is greatly
degraded, from 4.5 Gbps to 750 Mbps. I am confused by the perf result
for the CPU core where the iperf server runs, which shows a serious
bottleneck: 59.86% of CPU in report_bug and 20.66% in module_find_bug.
I am testing on CentOS 8.2 with the stock 4.18.0-193.el8.x86_64 kernel.

> > > [...]
> > > I am not clear about the test result. Can we change the sndbuf size in
> > > dpdk? Is there any way to enable vhost_net to use napi without changing
> > > the tun kernel driver?
>
> You can do this by not using INT_MAX as sndbuf.

As mentioned above, I changed the sndbuf value and hit a serious
performance degradation.
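To put the figures reported so far side by side, the sketch below expresses each result relative to the 4.5 Gbps baseline. The configuration labels are paraphrases of the descriptions in this thread, and the numbers are the per-flow iperf results as reported by the poster, not new measurements.

```python
# Per-flow iperf throughput with the default sndbuf (INT_MAX), in Gbps.
BASELINE_GBPS = 4.5

# Results as reported in the thread, in Gbps.
reported = {
    "sndbuf=212992 with NAPI enabled in tun": 0.75,
    "NAPI patch for the tun_xdp_one path": 9.2,
}


def relative_to_baseline(gbps, baseline=BASELINE_GBPS):
    """Return throughput as a multiple of the baseline."""
    return gbps / baseline


for label, gbps in reported.items():
    print(f"{label}: {gbps} Gbps, {relative_to_baseline(gbps):.2f}x baseline")
```

In other words, lowering the sndbuf cost about a factor of six, while the NAPI patch roughly doubled the baseline.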
* Re: Question about the sndbuf of the tap interface with vhost-net
From: Harold Huang @ 2022-02-24 4:36 UTC
To: Jason Wang; +Cc: users, Maxime Coquelin, Chenbo Xia, netdev

Harold Huang <baymaxhuang@gmail.com> wrote on Thu, Feb 24, 2022 at 12:19:
> [...]
> And I have changed the sndbuf in
> https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169
> to 212992, which is not INT_MAX anymore. I also enable NAPI in the tun
> module. The iperf server ran in the tap interface on the kernel side,
> which would receive TCP stream from dpdk-testpmd. But the performance
> is greatly degraded, from 4.5 Gbps to 750Mbps. I am confused about
> the perf result of the cpu core where iperf server ran, which has a
> serious bottleneck: 59.86% cpu on the report_bug and 20.66% on the
> module_find_bug. I use centos 8.2 with a native 4.18.0-193.el8.x86_64
> kernel to test.

BTW, if I set sock_can_batch = false directly in
https://github.com/torvalds/linux/blob/master/drivers/vhost/net.c#L782
and keep the default sk.sk_sndbuf size, i.e. INT_MAX, the test result
seems OK.
* Re: Question about the sndbuf of the tap interface with vhost-net
From: Jason Wang @ 2022-02-24 4:39 UTC
To: Harold Huang; +Cc: users, Maxime Coquelin, Chenbo Xia, netdev

On Thu, Feb 24, 2022 at 12:19 PM Harold Huang <baymaxhuang@gmail.com> wrote:
> [...]
> Yes, I have a patch in this path to enable NAPI and it greatly
> improves TCP stream performance, from 4.5 Gbps to 9.2 Gbps per flow. I
> will send it later for comments.

Good to know that.

Have you compared it with non-NAPI mode?

> [...]
> And I have changed the sndbuf in
> https://github.com/DPDK/dpdk/blob/main/drivers/net/virtio/virtio_user/vhost_kernel_tap.c#L169
> to 212992, which is not INT_MAX anymore. I also enable NAPI in the tun
> module. The iperf server ran in the tap interface on the kernel side,
> which would receive TCP stream from dpdk-testpmd.

Are you doing TCP stream testing between two TAPs and using testpmd to
forward traffic?

> But the performance
> is greatly degraded, from 4.5 Gbps to 750Mbps. I am confused about
> the perf result of the cpu core where iperf server ran, which has a
> serious bottleneck: 59.86% cpu on the report_bug and 20.66% on the
> module_find_bug.

This looks odd; you may want to check your perf. I don't think
module_find_bug() runs in the datapath.

> I use centos 8.2 with a native 4.18.0-193.el8.x86_64
> kernel to test.

That kernel is rather old; I suggest testing a recent kernel version.

Thanks
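When checking a profile like the one discussed here, it can help to tally the overhead per symbol. The following is a small hypothetical helper (`parse_perf_top` is not part of perf) that parses rows in the `Overhead / Shared Object / Symbol` layout and sums the share attributed to the two bug-reporting symbols. The sample rows are copied from the report in this thread.

```python
import re

# Matches rows such as: "59.86%  [kernel]  [k] report_bug"
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$")


def parse_perf_top(text):
    """Return a list of (overhead_percent, shared_object, symbol) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and blank lines do not match and are skipped
            rows.append((float(match.group(1)), match.group(2), match.group(4)))
    return rows


SAMPLE = """\
Overhead  Shared O  Symbol
  59.86%  [kernel]  [k] report_bug
  20.66%  [kernel]  [k] module_find_bug
   6.51%  [kernel]  [k] common_interrupt
   2.82%  [kernel]  [k] __slab_free
"""

rows = parse_perf_top(SAMPLE)
bug_share = sum(
    pct for pct, _, sym in rows if sym in ("report_bug", "module_find_bug")
)
print(f"{len(rows)} rows, bug-reporting symbols: {bug_share:.2f}%")
```

The header of the original report shows only 72 samples, so each sample is worth roughly 1.4% and the several rows near 1.41% are probably single samples. A longer capture may give a steadier picture before drawing conclusions.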
* Re: Question about the sndbuf of the tap interface with vhost-net
From: Harold Huang @ 2022-02-24 7:31 UTC
To: Jason Wang; +Cc: users, Maxime Coquelin, Chenbo Xia, netdev

Hi Jason,

Jason Wang <jasowang@redhat.com> wrote on Thu, Feb 24, 2022 at 12:40:
> [...]
> > Yes, I have a patch in this path to enable NAPI and it greatly
> > improves TCP stream performance, from 4.5 Gbps to 9.2 Gbps per flow. I
> > will send it later for comments.
>
> Good to know that.
>
> Have you compared it with non-NAPI mode?

Do you mean using netif_rx? If so, I have tested it and the performance
is about 5 Gbps. netif_rx calls process_backlog to process packets, but
it does not support GRO either.

> [...]
> > module. The iperf server ran in the tap interface on the kernel side,
> > which would receive TCP stream from dpdk-testpmd.
>
> Are you doing TCP stream testing between two TAPs and using testpmd to
> forward traffic?

The test topology is as follows:

 ________________________
|                        |
iperf-server-----tap<------->testpmd<------> ixgbe<----------->ixgbe (iperf client)
|_______________________|

testpmd is used to forward traffic from another machine.

> [...]
> This looks odd; you may want to check your perf. I don't think
> module_find_bug() runs in the datapath.
>
> > I use centos 8.2 with a native 4.18.0-193.el8.x86_64
> > kernel to test.
>
> That kernel is rather old; I suggest testing a recent kernel version.

I will test with a recent kernel later.