From: edgar helmut
Date: Sat, 24 Dec 2016 10:06:08 +0200
To: "Hu, Xuekun"
Cc: "Wiles, Keith", users@dpdk.org
Subject: Re: [dpdk-users] Dpdk poor performance on virtual machine
List-Id: DPDK usage discussions

I am looking for a means to measure packets going in and out of the VM (without asking the VM itself). Pure passthrough doesn't expose an interface to query for in/out packets, whereas macvtap does.

As for the anonymous hugepages, I was looking for a more flexible method and I assumed there is not much difference. I will run the test with reserved hugepages.

However, are there any known macvtap performance issues when delivering 5-6 Gbps?

Thanks

On 24 Dec 2016 9:06 AM, "Hu, Xuekun" wrote:

Now your setup has a new thing, “macvtap”. I don’t know what the performance of macvtap is. I only know it has much worse perf than “real” PCI pass-through.

I also don’t know why you selected such a config for your setup, anonymous huge pages and macvtap. Any specific purpose?

I think you should get a baseline first, then measure how much perf drops when using anonymous hugepages or macvtap.

1. Baseline: real hugepage + real pci pass-through
2. Anon hugepages vs hugepages
3. Real pci pass-through vs. macvtap

*From:* edgar helmut [mailto:helmut.edgar100@gmail.com]
*Sent:* Saturday, December 24, 2016 3:23 AM
*To:* Hu, Xuekun
*Cc:* Wiles, Keith; users@dpdk.org
*Subject:* Re: [dpdk-users] Dpdk poor performance on virtual machine

Hello,

I changed the setup but performance is still poor :( and I need your help to understand the root cause.

The setup is (sorry for the long description):

(Test equipment is pktgen using dpdk, installed on a second physical machine connected with 82599 NICs.)

host: Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz with a single socket, ubuntu 16.04, with 4 hugepages of 1G each.
hypervisor (kvm): QEMU emulator version 2.5.0
guest: same cpu as host, created with 3 vcpus, using ubuntu 16.04
dpdk: tried 2.2, 16.04, 16.07, 16.11 - using testpmd and 512 pages of 2M each.

Guest total memory is 2G and all of it is backed by the host with transparent hugepages (I can see the AnonHugePages consumed at guest creation). This memory includes the 512 hugepages for the testpmd application.

I pinned and isolated the guest's vcpus (using the kernel option isolcpus), and could clearly see that the isolation works well.

2 x 82599 NICs are connected as passthrough using macvtap interfaces to the guest, so the guest receives and forwards packets from one interface to the second and vice versa. In the guest I bind its interfaces using igb_uio.

The testpmd in the guest starts dropping packets at about 800 Mbps between both ports, bi-directional, using two vcpus for forwarding (one vcpu for application management and two for forwarding). At 1.2 Gbps it drops a lot of packets.

The same testpmd configuration on the host (between both 82599 NICs) forwards about 5-6 Gbps on both ports, bi-directional.

I assumed that forwarding ~5-6 Gbps between two ports should be trivial, so it would be great if someone could share their configuration for a tested setup. Any further ideas will be highly appreciated.

Thanks.

On Sat, Dec 17, 2016 at 2:56 PM edgar helmut wrote:

That's what I was afraid of.
In fact, I need the host to back the guest's entire memory with hugepages. I will find a way to do that and run the test again.

On 16 Dec 2016 3:14 AM, "Hu, Xuekun" wrote:

You said the VM’s memory was 6G, while transparent hugepages used only ~4G (4360192KB). So some of it was mapped to 4K pages.

BTW, the memory used by transparent hugepages is not the hugepages you reserved in the kernel boot option.

*From:* edgar helmut [mailto:helmut.edgar100@gmail.com]
*Sent:* Friday, December 16, 2016 1:24 AM
*To:* Hu, Xuekun
*Cc:* Wiles, Keith; users@dpdk.org
*Subject:* Re: [dpdk-users] Dpdk poor performance on virtual machine

In fact the VM was created with 6G RAM, and its kernel boot args are defined with 4 hugepages of 1G each, though when starting the VM I noted that AnonHugePages increased.

The relevant qemu process id is 6074, and the following sums the amount of allocated AnonHugePages:

    sudo grep -e AnonHugePages /proc/6074/smaps | awk '{ if($2>0) print $2} ' | awk '{s+=$1} END {print s}'

which gives 4360192, so not all the memory is backed by transparent hugepages, though it is more than the amount of hugepages the guest is supposed to boot with.

How can I be sure that the required 4G of hugepages is really allocated, and not, for example, that only 2G out of the 4G is allocated (with the remaining 2G mapped to the default 4K pages)?

thanks

On Thu, Dec 15, 2016 at 4:33 PM, Hu, Xuekun wrote:

Are you sure the AnonHugePages size was equal to the total VM's memory size? Sometimes the transparent huge page mechanism doesn't guarantee that the app is using real huge pages.

-----Original Message-----
From: users [mailto:users-bounces@dpdk.org] On Behalf Of edgar helmut
Sent: Thursday, December 15, 2016 9:32 PM
To: Wiles, Keith
Cc: users@dpdk.org
Subject: Re: [dpdk-users] Dpdk poor performance on virtual machine

I have one single socket, which is an Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz.

I just made two more steps:
1. setting iommu=pt for better usage of the igb_uio
2. using taskset and isolcpus, so now it looks like the relevant dpdk threads run on dedicated cores.

It improved the performance, though I still see a significant difference between the VM and the host which I can't fully explain.

Any further ideas?

Regards,
Edgar

On Thu, Dec 15, 2016 at 2:54 PM, Wiles, Keith wrote:

> > On Dec 15, 2016, at 1:20 AM, edgar helmut wrote:
> >
> > Hi.
> > Some help is needed to understand a performance issue on a virtual machine.
> >
> > Running testpmd on the host works well (testpmd forwards 10G between
> > two 82599 ports).
> > However, the same application running on a virtual machine on the same host
> > shows a huge degradation in performance.
> > The testpmd is then not even able to read 100 Mbps from the NIC without drops,
> > and from a profile I made it looks like a dpdk application runs more than
> > 10 times slower than on the host…
>
> Not sure I understand the overall setup, but did you make sure the NIC/PCI
> bus is on the same socket as the VM, if you have multiple sockets on your
> platform? If you have to access the NIC across the QPI, it could explain
> some of the performance drop. I'm not sure a drop that large is due to this problem.
>
> > Setup is ubuntu 16.04 for host and ubuntu 14.04 for guest.
> > Qemu is 2.3.0 (though I tried with a newer one as well).
> > NICs are connected to the guest using pci passthrough, and the guest's cpu is set
> > as passthrough (same as host).
> > On guest start the host allocates transparent hugepages (AnonHugePages), so
> > I assume the guest memory is backed with real hugepages on the host.
> > I tried binding with igb_uio and with uio_pci_generic, but both give the
> > same performance.
> >
> > Given the performance difference, I guess I am missing something.
> >
> > Please advise what I may be missing here.
> > Is this a native penalty of qemu?
> >
> > Thanks
> > Edgar
>
> Regards,
> Keith
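
The question raised in the thread, how to tell how much of the qemu process memory is really backed by transparent hugepages, can be approached with a small variation of the grep/awk one-liner quoted above. The following is only a minimal sketch, not a method anyone in the thread proposed: it assumes the usual /proc/<pid>/smaps layout, where each mapping reports "Rss:" and "AnonHugePages:" values in kB, and the function names smaps_sum_kb and thp_report are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sum one field (values in kB) across all mappings in a smaps-format file.
# Usage: smaps_sum_kb FIELD FILE
# (hypothetical helper; FIELD is given without the trailing colon)
smaps_sum_kb() {
    awk -v key="$1:" '$1 == key { sum += $2 } END { print sum + 0 }' "$2"
}

# Print the THP-backed amount next to the total resident size, so the two
# can be compared directly.
# Usage: thp_report FILE   (FILE is e.g. /proc/<pid>/smaps)
thp_report() {
    thp=$(smaps_sum_kb AnonHugePages "$1")
    rss=$(smaps_sum_kb Rss "$1")
    echo "AnonHugePages: ${thp} kB of Rss: ${rss} kB"
}
```

With the functions loaded in a root shell, `thp_report /proc/6074/smaps` would print both totals for the qemu process mentioned above. If AnonHugePages is clearly smaller than Rss, part of the resident memory is mapped with regular 4K pages. Note that this only reflects transparent hugepages; as pointed out in the thread, hugepages reserved at boot are accounted separately.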