From: "Vass, Sandor (Nokia - HU/Budapest)"
To: "dev@dpdk.org"
Subject: [dpdk-dev] VMXNET3 on vmware, ping delay
Date: Thu, 25 Jun 2015 09:14:53 +0000
Message-ID: <792CF0A6B0883C45AF8C719B2ECA946E42B2430F@DEMUMBX003.nsn-intra.net>

Hello,

I would like to create an IP packet processor program, and I chose DPDK because its speed looks promising.

I am trying to build a test environment to make development cheaper (so that we do not have to buy HW for each developer), so I created a test setup in
- VMWare Workstation 11
- using DPDK 2.0.0
- with linux kernel 3.10.0, CentOS7
- gcc 4.8.3
- and the standard, CentOS7-provided VMXNET3 driver, with the uio_pci_generic kernel module
(Shall I use vmxnet3-usermap.ko with DPDK 2.0.0? Where is it, and how could I compile it?)

I set up 3 machines:
- set all machines' network interface type to VMXNET3
- set up one machine (C1) for issuing ping, its interface has an IP: 192.168.3.21
- set up one machine (C2) for being the ping target, its interface has an IP: 192.168.3.23
- set up one machine (BR) to act as an L2 bridge using some of the examples provided. DPDK is compiled properly, 256x 2MB hugepages are created, and the example application is executed and running without (major) error.
- the three machines are connected linearly: C1 - BR - C2, using two private networks, one on each side of BR (VMnet2 and VMnet3), so the VMs are connected by vSwitches

The ping reply arrives and definitely goes through BR (extra console logs), but there are unexpected delays with examples/skeleton/basicfwd...

[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1018 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=18.7 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=8.87 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=10.2 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1012 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=12.7 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=1049 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=49.8 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=9.02 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=8.74 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1007 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=8.03 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=8.96 ms
64 bytes from 192.168.3.23: icmp_seq=19 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=20 ttl=64 time=9.27 ms
64 bytes from 192.168.3.23: icmp_seq=21 ttl=64 time=1008 ms
...

When I switched BR to multi_process/client_server_mp with 2 client processes, the result was almost the same:

[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=3.50 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1002 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=3.94 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=2003 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=2.29 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=3002 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=2.66 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=2.87 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=2.88 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=2.70 ms
...

And when I switched BR to test-pmd, the ping result was more or less normal (every command-line switch left at its default):

[root@localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=3.52 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=33.2 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=3.97 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=25.5 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=61.1 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=36.3 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=35.5 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=33.0 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=5.32 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=14.6 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=34.5 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=4.67 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=55.0 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=4.93 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=5.98 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=5.41 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=21.0 ms
...

Though I think these values are still quite high, I can accept that, as this is a virtualized environment.

Could someone please explain to me what is going on with the basicfwd and client-server examples? According to my understanding each packet should go through BR as fast as possible, but it seems that rte_eth_rx_burst retrieves packets only when there are at least 2 packets on the RX queue of the NIC. At least most of the time; there are cases (rare, according to my console log) when it retrieves a single packet, and sometimes only 3 packets can be retrieved...

What is the difference that makes test-pmd work without major delay while the others do not?

Thanks,
Sandor
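
P.S. One observation on the basicfwd numbers: each slow reply is close to 1000 ms plus the RTT of the fast reply that follows it. Since ping sends one request per second by default, that would be consistent with a request being held somewhere on BR until the next request arrives. Below is a minimal Python sketch of that arithmetic. The RTTs are copied from the basicfwd ping listing; the 1000 ms interval is an assumption (ping's default, not changed on the command line), and the names `BASICFWD_RTTS_MS` and `residuals` are only illustrative.

```python
# RTTs in ms for icmp_seq 1..20, copied from the basicfwd ping listing.
BASICFWD_RTTS_MS = [
    1018, 18.7, 1008, 8.87, 1010, 10.2, 1012, 12.7, 1049, 49.8,
    1008, 9.02, 1008, 8.74, 1007, 8.03, 1008, 8.96, 1008, 9.27,
]

# Assumption: ping ran with its default send interval of one second.
PING_INTERVAL_MS = 1000.0


def residuals(rtts_ms, interval_ms=PING_INTERVAL_MS):
    """Pair each slow reply with the fast reply that follows it.

    Returns (slow, fast, slow - interval - fast) tuples. A residual near
    zero means the slow request took one ping interval longer than the
    fast request that was sent right after it.
    """
    pairs = zip(rtts_ms[0::2], rtts_ms[1::2])
    return [(slow, fast, round(slow - interval_ms - fast, 2))
            for slow, fast in pairs]


if __name__ == "__main__":
    for slow, fast, residual in residuals(BASICFWD_RTTS_MS):
        print(f"slow={slow:7.1f} ms  fast={fast:5.2f} ms  "
              f"residual={residual:+.2f} ms")
```

For these 20 samples every residual stays within about 1.3 ms of zero (ping prints RTTs above 1000 ms without decimals, which accounts for part of that). This only shows that the numbers fit the "held until the next packet" picture; it does not show where the packet is waiting.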