From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kevin Traynor
To: Maxime Coquelin, yuanhan.liu@linux.intel.com, thomas.monjalon@6wind.com,
 john.mcnamara@intel.com, zhiyong.yang@intel.com, dev@dpdk.org
Cc: fbaudin@redhat.com
Organization: Red Hat
Date: Thu, 24 Nov 2016 11:58:17 +0000
References: <20161123210006.7113-1-maxime.coquelin@redhat.com>
In-Reply-To: <20161123210006.7113-1-maxime.coquelin@redhat.com>
Subject: Re: [dpdk-dev] [PATCH] doc: introduce PVP reference benchmark

On 11/23/2016 09:00 PM, Maxime Coquelin wrote:
> Having reference benchmarks is important in order to obtain
> reproducible performance figures.
>
> This patch describes required steps to configure a PVP setup
> using testpmd in both host and guest.
>
> Not relying on external vSwitch ease integration in a CI loop by
> not being impacted by DPDK API changes.
>
> Signed-off-by: Maxime Coquelin

A short template/hint of the main things to report after running it could be
useful to help ML discussions about the results, e.g.:

Traffic Generator: IXIA
Acceptable Loss: 100% (i.e. raw throughput test)
DPDK version/commit: v16.11
QEMU version/commit: v2.7.0
Patches applied:
CPU: E5-2680 v3, 2.8GHz
Result: x mpps
NIC: ixgbe 82599

> ---
>  doc/guides/howto/img/pvp_2nics.svg           | 556 +++++++++++++++++++++++++++
>  doc/guides/howto/index.rst                   |   1 +
>  doc/guides/howto/pvp_reference_benchmark.rst | 389 +++++++++++++++++++
>  3 files changed, 946 insertions(+)
>  create mode 100644 doc/guides/howto/img/pvp_2nics.svg
>  create mode 100644 doc/guides/howto/pvp_reference_benchmark.rst
>
> +Host tuning
> +~~~~~~~~~~~

I would add turbo boost=disabled in the BIOS.

> +
> +#. Append these options to Kernel command line:
> +
> +   .. code-block:: console
> +
> +      intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on
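
Maybe also worth a quick check that the isolation and hugepage reservation took
effect after reboot, e.g.:

    cat /proc/cmdline
    grep Huge /proc/meminfo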
> +
> +#. Disable hyper-threads at runtime if necessary and BIOS not accessible:
> +
> +   .. code-block:: console
> +
> +      cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \
> +          | sort | uniq \
> +          | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}'
> +
> +#. Disable NMIs:
> +
> +   .. code-block:: console
> +
> +      echo 0 > /proc/sys/kernel/nmi_watchdog
> +
> +#. Exclude isolated CPUs from the writeback cpumask:
> +
> +   .. code-block:: console
> +
> +      echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask
> +
> +#. Isolate CPUs from IRQs:
> +
> +   .. code-block:: console
> +
> +      clear_mask=0xfc #Isolate CPU2 to CPU7 from IRQs
> +      for i in /proc/irq/*/smp_affinity
> +      do
> +        echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
> +      done
> +
> +Qemu build
> +~~~~~~~~~~
> +
> +   .. code-block:: console
> +
> +      git clone git://dpdk.org/dpdk
> +      cd dpdk
> +      export RTE_SDK=$PWD
> +      make install T=x86_64-native-linuxapp-gcc DESTDIR=install
> +
> +DPDK build
> +~~~~~~~~~~
> +
> +   .. code-block:: console
> +
> +      git clone git://dpdk.org/dpdk
> +      cd dpdk
> +      export RTE_SDK=$PWD
> +      make install T=x86_64-native-linuxapp-gcc DESTDIR=install
> +
> +Testpmd launch
> +~~~~~~~~~~~~~~
> +
> +#. Assign NICs to DPDK:
> +
> +   .. code-block:: console
> +
> +      modprobe vfio-pci
> +      $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1
> +
> +*Note: Sandy Bridge family seems to have some limitations wrt its IOMMU,
> +giving poor performance results. To achieve good performance on these machines,
> +consider using UIO instead.*
> +
> +#. Launch testpmd application:
> +
> +   .. code-block:: console
> +
> +      $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \
> +          --vdev 'net_vhost0,iface=/tmp/vhost-user1' \
> +          --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \
> +          --portmask=f --disable-hw-vlan -i --rxq=1 --txq=1
> +          --nb-cores=4 --forward-mode=io
> +
> +#. In testpmd interactive mode, set the portlist to obtin the right chaining:
> +
> +   .. code-block:: console
> +
> +      set portlist 0,2,1,3
> +      start
> +
> +VM launch
> +~~~~~~~~~
> +
> +The VM may be launched ezither by calling directly QEMU, or by using libvirt.

s/ezither/either

> +
> +#. Qemu way:
> +
> +Launch QEMU with two Virtio-net devices paired to the vhost-user sockets created by testpmd:
> +
> +   .. code-block:: console
> +
> +      /bin/x86_64-softmmu/qemu-system-x86_64 \
> +          -enable-kvm -cpu host -m 3072 -smp 3 \
> +          -chardev socket,id=char0,path=/tmp/vhost-user1 \
> +          -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
> +          -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \
> +          -chardev socket,id=char1,path=/tmp/vhost-user2 \
> +          -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
> +          -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \
> +          -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \
> +          -numa node,memdev=mem -mem-prealloc \
> +          -net user,hostfwd=tcp::1002$1-:22 -net nic \
> +          -qmp unix:/tmp/qmp.socket,server,nowait \
> +          -monitor stdio .qcow2

The mergeable rx data path =off case would probably also want to be tested when
evaluating any performance improvements/regressions.
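
If so, I think that can be toggled per device directly on the QEMU command line
above, something along the lines of (mrg_rxbuf defaults to on):

    -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,mrg_rxbuf=off,addr=0x10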
> +
> +You can use this qmp-vcpu-pin script to pin vCPUs:
> +
> +   .. code-block:: python
> +
> +      #!/usr/bin/python
> +      # QEMU vCPU pinning tool
> +      #
> +      # Copyright (C) 2016 Red Hat Inc.
> +      #
> +      # Authors:
> +      #  Maxime Coquelin
> +      #
> +      # This work is licensed under the terms of the GNU GPL, version 2. See
> +      # the COPYING file in the top-level directory
> +      import argparse
> +      import json
> +      import os
> +
> +      from subprocess import call
> +      from qmp import QEMUMonitorProtocol
> +
> +      pinned = []
> +
> +      parser = argparse.ArgumentParser(description='Pin QEMU vCPUs to physical CPUs')
> +      parser.add_argument('-s', '--server', type=str, required=True,
> +                          help='QMP server path or address:port')
> +      parser.add_argument('cpu', type=int, nargs='+',
> +                          help='Physical CPUs IDs')
> +      args = parser.parse_args()
> +
> +      devnull = open(os.devnull, 'w')
> +
> +      srv = QEMUMonitorProtocol(args.server)
> +      srv.connect()
> +
> +      for vcpu in srv.command('query-cpus'):
> +          vcpuid = vcpu['CPU']
> +          tid = vcpu['thread_id']
> +          if tid in pinned:
> +              print 'vCPU{}\'s tid {} already pinned, skipping'.format(vcpuid, tid)
> +              continue
> +
> +          cpuid = args.cpu[vcpuid % len(args.cpu)]
> +          print 'Pin vCPU {} (tid {}) to physical CPU {}'.format(vcpuid, tid, cpuid)
> +          try:
> +              call(['taskset', '-pc', str(cpuid), str(tid)], stdout=devnull)
> +              pinned.append(tid)
> +          except OSError:
> +              print 'Failed to pin vCPU{} to CPU{}'.format(vcpuid, cpuid)
> +
> +
> +That can be used this way, for example to pin 3 vCPUs to CPUs 1, 6 and 7:

I think it would be good to explicitly explain the link you've made between the
core numbers in this case, i.e. between isolcpus, the vCPU pinning above and the
core list in the testpmd cmd line later (see the summary below).

> +
> +   .. code-block:: console
> +
> +      export PYTHONPATH=$PYTHONPATH:/scripts/qmp
> +      ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7
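
To expand on that, I mean a short summary of how the core numbers line up,
something like (as I read the example):

    host CPU 0      - host testpmd main (interactive) core
    host CPUs 2-5   - host testpmd forwarding cores (within isolcpus=2-7)
    host CPUs 1,6,7 - QEMU vCPUs 0,1,2 (pinned with qmp-vcpu-pin above)
    guest CPU 0     - guest testpmd main core
    guest CPUs 1,2  - guest testpmd forwarding cores (guest isolcpus=1,2)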
> +
> +#. Libvirt way:
> +
> +Some initial steps are required for libvirt to be able to connect to testpmd's
> +sockets.
> +
> +First, SELinux policy needs to be set to permissiven, as testpmd is run as root
> +(reboot required):
> +
> +   .. code-block:: console
> +
> +      cat /etc/selinux/config
> +
> +      # This file controls the state of SELinux on the system.
> +      # SELINUX= can take one of these three values:
> +      #     enforcing - SELinux security policy is enforced.
> +      #     permissive - SELinux prints warnings instead of enforcing.
> +      #     disabled - No SELinux policy is loaded.
> +      SELINUX=permissive
> +      # SELINUXTYPE= can take one of three two values:
> +      #     targeted - Targeted processes are protected,
> +      #     minimum - Modification of targeted policy. Only selected processes are protected.
> +      #     mls - Multi Level Security protection.
> +      SELINUXTYPE=targeted
> +
> +Also, Qemu needs to be run as root, which has to be specified in /etc/libvirt/qemu.conf:
> +
> +   .. code-block:: console
> +
> +      user = "root"
> +
> +Once the domain created, following snippset is an extract of most important
> +information (hugepages, vCPU pinning, Virtio PCI devices):
> +
> +   .. code-block:: xml

[domain XML snipped: 3145728 KiB memory, 3 vCPUs, hvm]
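
FWIW, with testpmd creating the vhost-user sockets, I'd expect the two ports to
look something like this in the domain XML (a sketch only; MACs reused from the
QEMU example, and mode='server' would be needed instead if QEMU is to create the
sockets):

    <interface type='vhostuser'>
      <mac address='52:54:00:02:d9:01'/>
      <source type='unix' path='/tmp/vhost-user1' mode='client'/>
      <model type='virtio'/>
    </interface>
    <interface type='vhostuser'>
      <mac address='52:54:00:02:d9:02'/>
      <source type='unix' path='/tmp/vhost-user2' mode='client'/>
      <model type='virtio'/>
    </interface>

along with hugepage backing (<memoryBacking><hugepages/></memoryBacking>) and, I
think, memAccess='shared' on the guest NUMA cell.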
> +
> +Guest setup
> +...........
> +
> +Guest tuning
> +~~~~~~~~~~~~
> +
> +#. Append these options to Kernel command line:
> +
> +   .. code-block:: console
> +
> +      default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_iommu=on iommu=pt isolcpus=1,2 rcu_nocbs=1,2 nohz_full=1,2
> +
> +#. Disable NMIs:
> +
> +   .. code-block:: console
> +
> +      echo 0 > /proc/sys/kernel/nmi_watchdog
> +
> +#. Exclude isolated CPU1 and CPU2 from the writeback wq cpumask:
> +
> +   .. code-block:: console
> +
> +      echo 1 > /sys/bus/workqueue/devices/writeback/cpumask
> +
> +#. Isolate CPUs from IRQs:
> +
> +   .. code-block:: console
> +
> +      clear_mask=0x6 #Isolate CPU1 and CPU2 from IRQs
> +      for i in /proc/irq/*/smp_affinity
> +      do
> +        echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
> +      done
> +
> +DPDK build
> +~~~~~~~~~~
> +
> +   .. code-block:: console
> +
> +      git clone git://dpdk.org/dpdk
> +      cd dpdk
> +      export RTE_SDK=$PWD
> +      make install T=x86_64-native-linuxapp-gcc DESTDIR=install
> +
> +Testpmd launch
> +~~~~~~~~~~~~~~
> +
> +Probe vfio module without iommu:
> +
> +   .. code-block:: console
> +
> +      modprobe -r vfio_iommu_type1
> +      modprobe -r vfio
> +      modprobe vfio enable_unsafe_noiommu_mode=1
> +      cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
> +      modprobe vfio-pci
> +
> +Bind virtio-net devices to DPDK:
> +
> +   .. code-block:: console
> +
> +      $RTE_SDK/tools/dpdk-devbind.py -b vfio-pci 0000:00:10.0 0000:00:11.0
> +
> +Start testpmd:
> +
> +   .. code-block:: console
> +
> +      $RTE_SDK/install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 \
> +          --proc-type auto --file-prefix pg -- \
> +          --portmask=3 --forward-mode=macswap --port-topology=chained \
> +          --disable-hw-vlan --disable-rss -i --rxq=1 --txq=1 \
> +          --rxd=256 --txd=256 --nb-cores=2 --auto-start
>
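
One last thought: it could help to mention how to sanity check that traffic
actually flows through the whole chain before taking numbers, e.g. from the
host testpmd prompt:

    show port stats all

and verifying that the RX/TX counters increase on all four ports.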