From: Vipul Ujawane
Date: Fri, 26 Jun 2020 17:39:23 +0800
To: David Christensen
Cc: users@dpdk.org
Subject: Re: [dpdk-users] Poor performance when using OVS with DPDK

On Fri, 2020-06-26 at 02:08 +0800, Vipul Ujawane wrote:
>
> ---------- Forwarded message ---------
> From: David Christensen
> Date: Fri, Jun 26, 2020, 02:03
> Subject: Re: [dpdk-users] Poor performance when using OVS with DPDK
> To: Vipul Ujawane
>
> On 6/24/20 4:03 AM, Vipul Ujawane wrote:
> > Dear all,
> >
> > I am observing very low performance when running OVS-DPDK compared
> > to OVS running with the kernel datapath.
> > I have OvS version 2.13.90 compiled from source with the latest
> > stable DPDK v19.11.3 on a stable Debian system running kernel
> > 4.19.0-9-amd64 (real version: 4.19.118).
> >
> > I have also tried the latest released OvS (2.12) with the same LTS
> > DPDK and, as a last resort, an older kernel (4.19.0-8-amd64, real
> > version: 4.19.98) in case the kernel itself was the problem.
> >
> > I have not been able to troubleshoot the problem and kindly request
> > your help regarding the same.
> >
> > HW configuration
> > ================
> > We have two identical servers (Debian stable, Intel(R) Xeon(R)
> > Gold 6230 CPU, 96 GB memory), each running a KVM virtual machine.
> > On the hypervisor layer, we have OvS for traffic routing. The
> > servers are connected directly via a Mellanox ConnectX-5 (1x100G).
> > OVS forwarding tables are configured for simple port-forwarding
> > only, to avoid any packet processing-related issues.
> >
> > Problem
> > =======
> > When both servers run OVS-Kernel at the hypervisor layer and the
> > VMs are connected to it via libvirt and virtio interfaces, the
> > VM->Server1->Server2->VM throughput is around 16-18 Gbps.
> > However, when using OVS-DPDK with the same setup, the throughput
> > drops to 4-6 Gbps.
>
> You don't mention the traffic profile. I assume 64 byte frames but
> best to be explicit.

Sure, sorry about that! We used iperf (MTU-sized packets), and the
measured throughput was 4-6 Gbps.

> > SW/driver configurations
> > ========================
> > DPDK
> > ----
> > In config/common_base, besides the defaults, I have enabled the
> > following extra drivers/features to be compiled/enabled:
> > CONFIG_RTE_LIBRTE_MLX5_PMD=y
> > CONFIG_RTE_LIBRTE_VHOST=y
> > CONFIG_RTE_LIBRTE_VHOST_NUMA=y
> > CONFIG_RTE_LIBRTE_PMD_VHOST=y
> > CONFIG_RTE_VIRTIO_USER=n
> > CONFIG_RTE_EAL_VFIO=y
> >
> > OVS
> > ---
> > $ ovs-vswitchd --version
> > ovs-vswitchd (Open vSwitch) 2.13.90
> >
> > $ sudo ovs-vsctl get Open_vSwitch . dpdk_initialized
> > true
> >
> > $ sudo ovs-vsctl get Open_vSwitch . dpdk_version
> > "DPDK 19.11.3"
> >
> > OS settings
> > -----------
> > $ lsb_release -a
> > No LSB modules are available.
> > Distributor ID: Debian
> > Description:    Debian GNU/Linux 10 (buster)
> > Release:        10
> > Codename:       buster
> >
> > $ cat /proc/cmdline
> > BOOT_IMAGE=/vmlinuz-4.19.0-9-amd64 root=/dev/mapper/Volume0-debian--stable
> > ro default_hugepagesz=1G hugepagesz=1G hugepages=16 intel_iommu=on
> > iommu=pt quiet
>
> Why don't you reserve any CPUs for OVS/DPDK or VM usage? All
> published performance white papers recommend settings for CPU
> isolation, like this Mellanox DPDK performance report:
>
> https://fast.dpdk.org/doc/perf/DPDK_19_08_Mellanox_NIC_performance_report.pdf
>
> For their test system:
>
> isolcpus=24-47 intel_idle.max_cstate=0 processor.max_cstate=0
> intel_pstate=disable nohz_full=24-47 rcu_nocbs=24-47 rcu_nocb_poll
> default_hugepagesz=1G hugepagesz=1G hugepages=64 audit=0 nosoftlockup
>
> Using the tuned service (CPU partitioning profile) makes this process
> easier:
>
> https://tuned-project.org/

Nice tutorial, thanks for sharing. I have checked it and configured
our server like this:

isolcpus=12-19 intel_idle.max_cstate=0 processor.max_cstate=0
nohz_full=12-19 rcu_nocbs=12-19 intel_pstate=disable
default_hugepagesz=1G hugepagesz=1G hugepages=24 audit=0 nosoftlockup
intel_iommu=on iommu=pt rcu_nocb_poll

Even though our servers are NUMA-capable and NUMA-aware, we only have
one CPU installed, in one socket. That CPU has 20 physical cores (40
threads), so I decided to use the "top-most" cores for DPDK/OVS; that
is the reason for isolcpus=12-19.
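To make sure the isolation actually took effect and that the NIC sits
on the same NUMA node as the isolated cores, a quick check like the
following should work (a minimal sketch assuming standard sysfs paths;
the PCI address is the one from the dpdk-devbind output below):

$ cat /sys/bus/pci/devices/0000:b3:00.0/numa_node   # NUMA node of the ConnectX-5
$ cat /sys/devices/system/cpu/isolated              # cores reserved via isolcpus
$ lscpu | grep NUMA                                 # CPU-to-node mapping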
> > $ ./usertools/dpdk-devbind.py --status
> > Network devices using kernel driver
> > ===================================
> > 0000:b3:00.0 'MT27800 Family [ConnectX-5] 1017' if=ens2 drv=mlx5_core
> > unused=igb_uio,vfio-pci
> >
> > Due to the way Mellanox cards and their driver work, I have not
> > bound igb_uio to the interface; however, the uio, igb_uio and
> > vfio-pci kernel modules are loaded.
> >
> > Relevant part of the VM config for Qemu/KVM
> > -------------------------------------------
> > [the libvirt XML was stripped by the list's MIME filter; only
> > fragments survive]
> > 4096
>
> Where did you get these CPU mapping values? x86 systems typically map
> even-numbered CPUs to one NUMA node and odd-numbered CPUs to a
> different NUMA node. You generally want to select CPUs from the same
> NUMA node as the mlx5 NIC you're using for DPDK.
>
> You should have at least 4 CPUs in the VM, selected according to the
> NUMA topology of the system.

As per my answer above, our system has no second NUMA node; all
mappings go to the same socket/CPU.

> Take a look at this bash script written for Red Hat:
>
> https://github.com/ctrautma/RHEL_NIC_QUALIFICATION/blob/ansible/ansible/get_cpulist.sh
>
> It gives you a good starting reference for which CPUs to select for
> the OVS/DPDK and VM configurations on your particular system. Also
> review the Ansible script pvp_ovsdpdk.yml; it provides a lot of other
> useful steps you might be able to apply to your Debian OS.

> > memAccess='shared'/>
> > path='/usr/local/var/run/openvswitch/vhostuser' mo$
>
> Is there a requirement for mergeable RX buffers? Some PMDs like mlx5
> can take advantage of SSE instructions when this is disabled,
> yielding better performance.

Good point; there is no requirement. I just took an example config and
thought it was necessary for the driver queue settings.
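If I disable them explicitly, I believe the libvirt fragment would
look roughly like this (a sketch only, not my exact XML: the socket
path is the one quoted above, while mode='client' and queues='2' are
guesses, since my original lines were truncated; mrg_rxbuf is the host
offload attribute under the interface's <driver> element):

<interface type='vhostuser'>
  <source type='unix' path='/usr/local/var/run/openvswitch/vhostuser' mode='client'/>
  <model type='virtio'/>
  <driver name='vhost' queues='2'>
    <host mrg_rxbuf='off'/>
  </driver>
</interface>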
> > function='0x0'$
>
> I don't see hugepage usage in the libvirt XML. Something similar to:
>
> 8388608
> 8388608
> [David's <memoryBacking> example was also stripped by the list's MIME
> filter; only the sizes in KiB survive]

I did not copy this part of the XML, but we have hugepages configured
properly.

> > OVS Start Config
> > ----------------
> > ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
> > ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="4096,0"
> > ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0xff
> > ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0e
>
> These two masks shouldn't overlap:
> https://developers.redhat.com/blog/2017/06/28/ovs-dpdk-parameters-dealing-with-multi-numa/

Thanks, this really helped me understand the order in which these
commands should be issued.

So, the problem now is the following. I made all the changes you
shared and started OVS-DPDK properly, with these settings:

ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="8192,0"
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x01000
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true

and, finally, this:

ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0x0e000

The documentation you shared says this last one can even be set at
runtime, so I was playing with it to see whether anything changes.

I did not start any VM on top of OVS-DPDK; I just set up a
port-forward rule (in_port=1, actions=output:IN_PORT), since I only
have one physical port on each Mellanox card. Then I generated traffic
from the other server towards OVS with Pktgen. Using pktsize 64B, the
maximum throughput Pktgen reports is 8 Gbps. In particular, I got
these metrics:

Size    Sent_pps  Recv_pps  Recv_Gbps
64B     93M       11M       ~8
128B    65M       12.5M     ~15
256B    42.5M     12.3M     ~27
512B    23.5M     11.9M     ~51
1024B   11.9M     10M       ~83
1280B   9.6M      8.3M      ~86
1500B   8.3M      6.7M      ~82

It is quite interesting that the received pps for 64B is lower than
for larger sizes, since pps should be the practical bottleneck, and
from packet size and pps we can compute the throughput in Gbps (e.g.,
11 Mpps x 84 bytes on the wire (64B frame + 20B preamble/IFG) x 8 bits
is roughly 7.4 Gbps, consistent with the ~8 Gbps reported).

Anyway, OVS-DPDK has 3 cores to use, but only one rx queue is assigned
to the port, so basically --- as `top` also shows --- this is
single-core performance. Increasing the number of cores did not help;
the performance remained the same.

Is this performance normal for OVS-DPDK?
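One more thing I intend to try is adding RX queues so that the extra
PMD cores actually get work; something like the following sketch (the
port name dpdk-p0 is a placeholder for the actual DPDK port on the
bridge):

# one RX queue per PMD core
ovs-vsctl set Interface dpdk-p0 options:n_rxq=3
# optionally pin queues to the isolated PMD cores (13-15 match pmd-cpu-mask=0x0e000)
ovs-vsctl set Interface dpdk-p0 other_config:pmd-rxq-affinity="0:13,1:14,2:15"
# verify the queue-to-PMD distribution
ovs-appctl dpif-netdev/pmd-rxq-show

With a single RX queue only one PMD thread can poll the port, which
would explain why adding cores alone changed nothing.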