From mboxrd@z Thu Jan 1 00:00:00 1970
To: "users@dpdk.org"
Cc: Maxime Coquelin, Chenbo Xia, Zhihong Wang
From: David Christensen
Message-ID: <6a498e5b-87f5-ad62-5876-c5a08386c3e3@linux.vnet.ibm.com>
Date: Wed, 12 Aug 2020 11:16:02 -0700
Subject: [dpdk-users] Vhost PMD Performance Doesn't Scale as Expected
List-Id: DPDK usage discussions

I'm examining performance between two VMs connected with a vhost
interface on DPDK 20.08 and testpmd. Each VM (client-0, server-0) has
4 vCPUs, 4 RX/TX queues per port, 4GB RAM, and runs 8 containers, each
with an instance of qperf running the tcp_bw test. The configuration is
targeting all CPU/memory activity for NUMA node 1.

As I increase the number of qperf pairs, the cumulative throughput
doesn't scale as I had hoped. Here's a table with some results:

                       concurrent qperf pairs
msg_size            1            2            4            8
  8,192    12.74 Gb/s   21.68 Gb/s   27.89 Gb/s   30.94 Gb/s
 16,384    13.84 Gb/s   24.06 Gb/s   28.51 Gb/s   30.47 Gb/s
 32,768    16.13 Gb/s   24.49 Gb/s   28.89 Gb/s   30.23 Gb/s
 65,536    16.19 Gb/s   22.53 Gb/s   29.79 Gb/s   30.46 Gb/s
131,072    15.37 Gb/s   23.89 Gb/s   29.65 Gb/s   30.88 Gb/s
262,144    14.73 Gb/s   22.97 Gb/s   29.54 Gb/s   31.28 Gb/s
524,288    14.62 Gb/s   23.39 Gb/s   28.70 Gb/s   30.98 Gb/s

Can anyone suggest a configuration change that might improve
performance, or is this generally what is expected? I was expecting
performance to nearly double as I move from 1 to 2 to 4 queues. Even
single-queue performance is below Intel's published performance results
(see
https://fast.dpdk.org/doc/perf/DPDK_20_05_Intel_virtio_performance_report.pdf),
though I was unable to get the vhost-switch example application to run
due to an mbuf allocation error for the i40e PMD and had to revert to
the testpmd app.

Configuration details below.
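
To put a number on the scaling in the table above, here is a minimal
Python sketch. It takes the 1-pair column as the baseline and assumes
ideal scaling would be linear in the number of qperf pairs. The names
PAIRS, RESULTS, and scaling are illustrative only and are not part of
qperf or DPDK.

```python
# Cumulative throughput in Gb/s, copied from the results table and keyed
# by msg_size. Columns are 1, 2, 4, and 8 concurrent qperf pairs.
PAIRS = (1, 2, 4, 8)
RESULTS = {
    8192:   (12.74, 21.68, 27.89, 30.94),
    16384:  (13.84, 24.06, 28.51, 30.47),
    32768:  (16.13, 24.49, 28.89, 30.23),
    65536:  (16.19, 22.53, 29.79, 30.46),
    131072: (15.37, 23.89, 29.65, 30.88),
    262144: (14.73, 22.97, 29.54, 31.28),
    524288: (14.62, 23.39, 28.70, 30.98),
}


def scaling(row):
    """Return a (speedup, efficiency) tuple per column.

    speedup    = throughput / 1-pair throughput
    efficiency = speedup / number of pairs (1.0 would be linear scaling)
    """
    base = row[0]
    return [(gbps / base, gbps / (base * n)) for gbps, n in zip(row, PAIRS)]


def main():
    header = "  ".join(f"{n} pair(s)".rjust(12) for n in PAIRS)
    print(f"{'msg_size':>8}  {header}")
    for msg_size, row in RESULTS.items():
        cells = "  ".join(f"{s:4.2f}x {e:5.0%}".rjust(12) for s, e in scaling(row))
        print(f"{msg_size:>8}  {cells}")


if __name__ == "__main__":
    main()
```

For msg_size 8,192 this works out to roughly 1.7x at 2 pairs and 2.4x
at 8 pairs, compared with the 2x and 8x that linear scaling would give.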

Dave

/proc/cmdline:
--------------
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-147.el8.x86_64 root=/dev/mapper/rhel-root ro intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=64 crashkernel=auto resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet =1 nohz=on nohz_full=8-15,24-31 rcu_nocbs=8-15,24-31 tuned.non_isolcpus=00ff00ff intel_pstate=disable nosoftlockup

testpmd command-line:
---------------------
~/src/dpdk/build/app/dpdk-testpmd -l 7,24-31 -n 4 --no-pci \
  --vdev 'net_vhost0,iface=/tmp/vhost-dpdk-server-0,dequeue-zero-copy=1,tso=1,queues=4' \
  --vdev 'net_vhost1,iface=/tmp/vhost-dpdk-client-0,dequeue-zero-copy=1,tso=1,queues=4' \
  -- -i --nb-cores=8 --numa --rxq=4 --txq=4

testpmd forwarding core mapping:
--------------------------------
Start automatic packet forwarding
io packet forwarding - ports=2 - cores=8 - streams=8 - NUMA support enabled, MP allocation mode: native
Logical Core 24 (socket 1) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
Logical Core 25 (socket 1) forwards packets on 1 streams:
  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
Logical Core 26 (socket 1) forwards packets on 1 streams:
  RX P=0/Q=1 (socket 0) -> TX P=1/Q=1 (socket 0) peer=02:00:00:00:00:01
Logical Core 27 (socket 1) forwards packets on 1 streams:
  RX P=1/Q=1 (socket 0) -> TX P=0/Q=1 (socket 0) peer=02:00:00:00:00:00
Logical Core 28 (socket 1) forwards packets on 1 streams:
  RX P=0/Q=2 (socket 0) -> TX P=1/Q=2 (socket 0) peer=02:00:00:00:00:01
Logical Core 29 (socket 1) forwards packets on 1 streams:
  RX P=1/Q=2 (socket 0) -> TX P=0/Q=2 (socket 0) peer=02:00:00:00:00:00
Logical Core 30 (socket 1) forwards packets on 1 streams:
  RX P=0/Q=3 (socket 0) -> TX P=1/Q=3 (socket 0) peer=02:00:00:00:00:01
Logical Core 31 (socket 1) forwards packets on 1 streams:
  RX P=1/Q=3 (socket 0) -> TX P=0/Q=3 (socket 0) peer=02:00:00:00:00:00

  io packet forwarding packets/burst=32
  nb forwarding cores=8 - nb forwarding ports=2
  port 0: RX queue number: 4 Tx queue number: 4
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=0 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0 wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=0 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0 wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
  port 1: RX queue number: 4 Tx queue number: 4
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=0 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0 wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=0 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0 wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0

lscpu:
------
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
Stepping:            4
CPU MHz:             2400.075
BogoMIPS:            4200.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            11264K
NUMA node0 CPU(s):   0-7,16-23
NUMA node1 CPU(s):   8-15,24-31

server-0 libvirt XML:
---------------------
... 4194304 4194304 4 hvm ...
...
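
As a cross-check of the CPU placement described above, the following
Python sketch expands the CPU lists from the lscpu output, the kernel
command line, and the testpmd -l argument, then reports which NUMA node
each testpmd lcore falls on. It is only a sketch under the assumption
that those three strings are copied correctly from the configuration
above. The helper names (expand_cpu_list, node_of) are hypothetical and
do not come from DPDK.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as "8-15,24-31" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        # A bare number like "7" has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


# Values copied from the lscpu output, the testpmd command line, and
# /proc/cmdline shown above.
NUMA_NODES = {0: "0-7,16-23", 1: "8-15,24-31"}
TESTPMD_LCORES = "7,24-31"   # testpmd -l argument
NOHZ_FULL = "8-15,24-31"     # nohz_full= (same list as rcu_nocbs=)


def node_of(cpu):
    """Return the NUMA node that lists this CPU, per NUMA_NODES."""
    for node, spec in NUMA_NODES.items():
        if cpu in expand_cpu_list(spec):
            return node
    raise ValueError(f"CPU {cpu} is not listed under any NUMA node")


def main():
    nohz = expand_cpu_list(NOHZ_FULL)
    for cpu in sorted(expand_cpu_list(TESTPMD_LCORES)):
        tag = "in nohz_full set" if cpu in nohz else "not in nohz_full set"
        print(f"lcore {cpu:2d}: NUMA node {node_of(cpu)}, {tag}")


if __name__ == "__main__":
    main()
```

With the values above, lcores 24-31 map to NUMA node 1 and lcore 7 (the
first lcore in the -l list) maps to NUMA node 0.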