From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Kevin Traynor <ktraynor@redhat.com>, yuanhan.liu@linux.intel.com,
 thomas.monjalon@6wind.com, john.mcnamara@intel.com, zhiyong.yang@intel.com,
 dev@dpdk.org
Cc: fbaudin@redhat.com
Date: Thu, 24 Nov 2016 13:39:34 +0100
Message-ID: <4759dbda-efb2-4887-d663-5cb84d6a9241@redhat.com>
References: <20161123210006.7113-1-maxime.coquelin@redhat.com>
Subject: Re: [dpdk-dev] [PATCH] doc: introduce PVP reference benchmark

On 11/24/2016 12:58 PM, Kevin Traynor wrote:
> On 11/23/2016 09:00 PM, Maxime Coquelin wrote:
>> Having reference benchmarks is important in order to obtain
>> reproducible performance figures.
>>
>> This patch describes the required steps to configure a PVP setup
>> using testpmd in both host and guest.
>>
>> Not relying on an external vSwitch eases integration in a CI loop,
>> as the setup is not impacted by DPDK API changes.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> A short template/hint of the main things to report after running could
> be useful to help ML discussions about results e.g.
>
> Traffic Generator: IXIA
> Acceptable Loss: 100% (i.e. raw throughput test)
> DPDK version/commit: v16.11
> QEMU version/commit: v2.7.0
> Patches applied:
> CPU: E5-2680 v3, 2.8GHz
> Result: x mpps
> NIC: ixgbe 82599

Good idea, I'll add a section at the end providing this template.

>
>> ---
>>  doc/guides/howto/img/pvp_2nics.svg           | 556 +++++++++++++++++++++++++++
>>  doc/guides/howto/index.rst                   |   1 +
>>  doc/guides/howto/pvp_reference_benchmark.rst | 389 +++++++++++++++++++
>>  3 files changed, 946 insertions(+)
>>  create mode 100644 doc/guides/howto/img/pvp_2nics.svg
>>  create mode 100644 doc/guides/howto/pvp_reference_benchmark.rst
>>
>
>
>> +Host tuning
>> +~~~~~~~~~~~
>
> I would add turbo boost =disabled on BIOS.

+1, will be in next revision.

>> +
>> +#. Append these options to the kernel command line:
>> +
>> +   .. code-block:: console
>> +
>> +      intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on
>> +
>> +#. Disable hyper-threads at runtime, if necessary and the BIOS is not accessible:
>> +
>> +   .. code-block:: console
>> +
>> +      cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \
>> +          | sort | uniq \
>> +          | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}'
>> +
>> +#. Disable NMIs:
>> +
>> +   .. code-block:: console
>> +
>> +      echo 0 > /proc/sys/kernel/nmi_watchdog
>> +
>> +#. Exclude isolated CPUs from the writeback cpumask:
>> +
>> +   .. code-block:: console
>> +
>> +      echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask
>> +
>> +#. Isolate CPUs from IRQs:
>> +
>> +   .. code-block:: console
>> +
>> +      clear_mask=0xfc # Isolate CPU2 to CPU7 from IRQs
>> +      for i in /proc/irq/*/smp_affinity
>> +      do
>> +          echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
>> +      done
>> +
>> +Qemu build
>> +~~~~~~~~~~
>> +
>> +   .. code-block:: console
>> +
>> +      git clone git://git.qemu.org/qemu.git
>> +      cd qemu
>> +      mkdir bin
>> +      cd bin
>> +      ../configure --target-list=x86_64-softmmu
>> +      make
>> +
>> +DPDK build
>> +~~~~~~~~~~
>> +
>> +   .. code-block:: console
>> +
>> +      git clone git://dpdk.org/dpdk
>> +      cd dpdk
>> +      export RTE_SDK=$PWD
>> +      make install T=x86_64-native-linuxapp-gcc DESTDIR=install
>> +
>> +Testpmd launch
>> +~~~~~~~~~~~~~~
>> +
>> +#. Assign NICs to DPDK:
>> +
>> +   .. code-block:: console
>> +
>> +      modprobe vfio-pci
>> +      $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1
>> +
>> +*Note: Sandy Bridge family seems to have some limitations wrt its IOMMU,
>> +giving poor performance results. To achieve good performance on these machines,
>> +consider using UIO instead.*
>> +
>> +#. Launch testpmd application:
>> +
>> +   .. code-block:: console
>> +
>> +      $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \
>> +          --vdev 'net_vhost0,iface=/tmp/vhost-user1' \
>> +          --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \
>> +          --portmask=f --disable-hw-vlan -i --rxq=1 --txq=1 \
>> +          --nb-cores=4 --forward-mode=io
>> +
>> +#. In testpmd interactive mode, set the portlist to obtain the right chaining:
>> +
>> +   .. code-block:: console
>> +
>> +      set portlist 0,2,1,3
>> +      start
>> +
>> +VM launch
>> +~~~~~~~~~
>> +
>> +The VM may be launched ezither by calling directly QEMU, or by using libvirt.
>
> s/ezither/either
>
>> +
>> +#. Qemu way:
>> +
>> +Launch QEMU with two Virtio-net devices paired to the vhost-user sockets created by testpmd:
>> +
>> +   .. code-block:: console
>> +
>> +      <QEMU path>/bin/x86_64-softmmu/qemu-system-x86_64 \
>> +          -enable-kvm -cpu host -m 3072 -smp 3 \
>> +          -chardev socket,id=char0,path=/tmp/vhost-user1 \
>> +          -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
>> +          -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \
>> +          -chardev socket,id=char1,path=/tmp/vhost-user2 \
>> +          -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
>> +          -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \
>> +          -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \
>> +          -numa node,memdev=mem -mem-prealloc \
>> +          -net user,hostfwd=tcp::1002$1-:22 -net nic \
>> +          -qmp unix:/tmp/qmp.socket,server,nowait \
>> +          -monitor stdio <vm_image>.qcow2
>
> Probably mergeable rx data path =off would want to be tested also when
> evaluating any performance improvements/regressions.

Right, I'll add a few lines about this.
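
For reference, one way to test that (illustrative only, reusing the first
Virtio device from the command line above) is to disable mergeable RX
buffers per device with the mrg_rxbuf property:

    -device virtio-net-pci,netdev=mynet1,mrg_rxbuf=off,mac=52:54:00:02:d9:01,addr=0x10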

>
>> +
>> +You can use this qmp-vcpu-pin script to pin vCPUs:
>> +
>> +   .. code-block:: python
>> +
>> +      #!/usr/bin/python
>> +      # QEMU vCPU pinning tool
>> +      #
>> +      # Copyright (C) 2016 Red Hat Inc.
>> +      #
>> +      # Authors:
>> +      #  Maxime Coquelin <maxime.coquelin@redhat.com>
>> +      #
>> +      # This work is licensed under the terms of the GNU GPL, version 2. See
>> +      # the COPYING file in the top-level directory.
>> +      import argparse
>> +      import json
>> +      import os
>> +
>> +      from subprocess import call
>> +      from qmp import QEMUMonitorProtocol
>> +
>> +      pinned = []
>> +
>> +      parser = argparse.ArgumentParser(description='Pin QEMU vCPUs to physical CPUs')
>> +      parser.add_argument('-s', '--server', type=str, required=True,
>> +                          help='QMP server path or address:port')
>> +      parser.add_argument('cpu', type=int, nargs='+',
>> +                          help='Physical CPUs IDs')
>> +      args = parser.parse_args()
>> +
>> +      devnull = open(os.devnull, 'w')
>> +
>> +      srv = QEMUMonitorProtocol(args.server)
>> +      srv.connect()
>> +
>> +      for vcpu in srv.command('query-cpus'):
>> +          vcpuid = vcpu['CPU']
>> +          tid = vcpu['thread_id']
>> +          if tid in pinned:
>> +              print 'vCPU{}\'s tid {} already pinned, skipping'.format(vcpuid, tid)
>> +              continue
>> +
>> +          cpuid = args.cpu[vcpuid % len(args.cpu)]
>> +          print 'Pin vCPU {} (tid {}) to physical CPU {}'.format(vcpuid, tid, cpuid)
>> +          try:
>> +              call(['taskset', '-pc', str(cpuid), str(tid)], stdout=devnull)
>> +              pinned.append(tid)
>> +          except OSError:
>> +              print 'Failed to pin vCPU{} to CPU{}'.format(vcpuid, cpuid)
>> +
>> +
>> +That can be used this way, for example to pin 3 vCPUs to CPUs 1, 6 and 7:
>
> I think it would be good to explicitly explain the link you've made on
> core numbers in this case between isolcpus, the vCPU pinning above and
> the core list in the testpmd cmd line later.

Yes. vCPU0 doesn't run testpmd lcores, so it doesn't need to be pinned to
an isolated CPU. vCPU1 and vCPU2 are used as lcores, so they are pinned to
isolated CPUs. I'll add this information to the next version.
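
For example, with the QMP socket path used in the QEMU command line above,
the invocation would be something like:

    ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7

which pins vCPU0 to host CPU 1 (non-isolated) and vCPU1/vCPU2 to the
isolated CPUs 6 and 7.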

Thanks,
Maxime