From: Pavel Vajarov
Date: Thu, 7 May 2020 13:47:33 +0300
To: Stephen Hemminger
Cc: users@dpdk.org
Subject: Re: [dpdk-users] Performance troubleshooting of TCP/IP stack over DPDK.

On Wed, May 6, 2020 at 5:55 PM Stephen Hemminger wrote:

> On Wed, 6 May 2020 08:14:20 +0300
> Pavel Vajarov wrote:
>
> > Hi there,
> >
> > We are trying to compare the performance of the DPDK+FreeBSD networking
> > stack vs the standard Linux kernel, and we are having trouble finding out
> > why the former is slower. The details are below.
> >
> > There is a project called F-Stack. It glues the networking stack from
> > FreeBSD 11.01 over DPDK. We made a setup to test the performance of a
> > transparent TCP proxy based on F-Stack and another one running on the
> > standard Linux kernel. We did the tests on KVM with 2 cores (Intel(R)
> > Xeon(R) Gold 6139 CPU @ 2.30GHz) and 32GB RAM. A 10Gbps NIC was attached
> > in passthrough mode.
> > The application-level code, the part that handles the epoll notifications
> > and memcpys data between the sockets, is 100% the same in both proxy
> > applications. Both proxies are single-threaded, and in all tests we pinned
> > the application to core 1. For the test with the standard Linux
> > application, the interrupts from the network card were pinned to the same
> > core 1.
> >
> > Here are the test results:
> > 1. The Linux-based proxy was able to handle about 1.7-1.8 Gbps before it
> > started to throttle the traffic. No visible CPU usage was observed on
> > core 0 during the tests; only core 1, where the application and the IRQs
> > were pinned, took the load.
> > 2. The DPDK+FreeBSD proxy was able to handle 700-800 Mbps before it
> > started to throttle the traffic. No visible CPU usage was observed on
> > core 0 during the tests; only core 1, where the application was pinned,
> > took the load. In some of the later tests I changed the number of packets
> > read from the network card in one call and the number of events handled
> > in one call to epoll. With these changes I was able to increase the
> > throughput to 900-1000 Mbps, but couldn't increase it further.
> > 3. We did another test with the DPDK+FreeBSD proxy just to give us some
> > more information about the problem. We disabled the TCP proxy
> > functionality and let the packets simply be IP-forwarded by the FreeBSD
> > stack. In this test we reached up to 5 Gbps without being able to
> > throttle the traffic; we just don't have more traffic to redirect there
> > at the moment. So the bottleneck seems to be either in the upper levels
> > of the network stack or in the application code.
> >
> > There is a Huawei switch which redirects the traffic to this server. It
> > regularly sends ARP pings, and if the server doesn't respond it stops the
> > redirection. So we assumed that when the redirection stops it's because
> > the server throttles the traffic, drops packets, and can't respond to the
> > ARP pings because of the packet drops.
> >
> > The whole application can be very roughly represented in the following
> > way (a rough sketch in code follows below):
> > - Write pending outgoing packets to the network card
> > - Read incoming packets from the network card
> > - Push the incoming packets to the FreeBSD stack
> > - Call epoll_wait/kevent without waiting
> > - Handle the events
> > - Loop from the beginning
> > According to the performance profiling that we did, aside from packet
> > processing, about 25-30% of the application time seems to be spent in
> > epoll_wait/kevent, even though the `timeout` parameter of this call is
> > set to 0, i.e. it shouldn't block waiting for events if there are none.
> >
> > I can give you much more details and code for everything, if needed.
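
[For reference: below is a minimal sketch of what a polling loop of this shape
can look like, assuming F-Stack's documented ff_* API (ff_init, ff_run,
ff_epoll_create, ff_epoll_wait). The handle_event() helper is hypothetical and
stands in for the real proxy code, which is not shown in this thread.]

#include <ff_api.h>
#include <ff_epoll.h>

#define MAX_EVENTS 512

static int epfd;

/* Stand-in for the real proxy logic: read from the ready socket, memcpy the
 * payload, write it to the paired socket. */
static void handle_event(struct epoll_event *ev)
{
    (void)ev;
}

/* Called by ff_run() once per stack iteration; the stack drives NIC RX/TX
 * around this callback, which is why the epoll timeout is 0. */
static int proxy_loop(void *arg)
{
    struct epoll_event events[MAX_EVENTS];
    (void)arg;

    /* timeout = 0: collect whatever is ready without blocking. */
    int n = ff_epoll_wait(epfd, events, MAX_EVENTS, 0);
    for (int i = 0; i < n; i++)
        handle_event(&events[i]);

    return 0;
}

int main(int argc, char *argv[])
{
    ff_init(argc, argv);            /* bring up DPDK and the FreeBSD stack */
    epfd = ff_epoll_create(1);      /* kevent-backed epoll instance */
    /* ... create listening sockets and register them with ff_epoll_ctl() ... */
    ff_run(proxy_loop, NULL);       /* never returns */
    return 0;
}

[The epoll batching knob mentioned above corresponds to MAX_EVENTS here: a
larger value lets one ff_epoll_wait() call drain more ready sockets per stack
iteration.]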

> > My questions are:
> > 1. Does somebody have observations or educated guesses about how much
> > traffic I should expect the DPDK + FreeBSD stack + kevent to handle in
> > the above scenario? Are the numbers low or expected? We expected to see
> > better performance than with the standard Linux kernel, but so far we
> > can't get it.
> > 2. Do you think the difference comes from the time spent handling packets
> > versus handling epoll in the two tests? What I mean is: for the standard
> > Linux tests, interrupt handling has higher priority than the epoll
> > handling, so the application can spend much more time handling and
> > processing packets in the kernel than handling epoll events in user
> > space. For the DPDK+FreeBSD case, the time for handling packets and the
> > time for processing epoll events is roughly equal. I think this is why we
> > were able to get more performance by increasing the number of packets
> > read in one go and decreasing the number of epoll events handled per
> > call. However, we couldn't increase the throughput enough with these
> > tweaks.
> > 3. Can you suggest something else that we can test/measure/profile to get
> > a better idea of what exactly is happening here and to improve the
> > performance further?
> >
> > Any help is appreciated!
> >
> > Thanks in advance,
> > Pavel.
>
> First off, if you are testing on KVM, are you using PCI passthrough or
> SR-IOV to make the device available to the guest directly? The default mode
> uses a Linux bridge, and this results in multiple copies and context
> switches. You end up testing Linux bridge and virtio performance, not TCP.
>
> To get full speed with TCP and most software stacks you need TCP
> segmentation offload.
>
> Also, software queue discipline, kernel version, and TCP congestion control
> can play a big role in your results.

Hi,

Thanks for the response.

We did the tests on Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-96-generic x86_64).
The NIC was given to the guest using SR-IOV. TCP segmentation offload was
enabled for both tests (standard Linux and DPDK+FreeBSD). The congestion
control algorithm for both tests was 'cubic'.

What do you mean by 'software queue discipline'?

Regards,
Pavel.
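
P.S. Since congestion control came up: one way to make sure both proxies
really compare the same algorithm is to pin it per socket rather than rely on
the system-wide default. A minimal sketch for the Linux side is below;
TCP_CONGESTION exists as a socket option on both Linux and FreeBSD, and on the
F-Stack side the equivalent calls would go through ff_setsockopt() and
ff_getsockopt(), assuming the FreeBSD build actually ships the cubic module
(an assumption on my part, not something confirmed above).

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Force, then report, the per-socket congestion control algorithm. */
static void force_cubic(int fd)
{
    const char algo[] = "cubic";
    char cur[16] = {0};
    socklen_t len = sizeof(cur);

    if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, algo, strlen(algo)) != 0)
        perror("setsockopt(TCP_CONGESTION)");

    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, cur, &len) == 0)
        printf("fd %d is using %s\n", fd, cur);
}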