Date: Wed, 6 May 2020 07:54:56 -0700
From: Stephen Hemminger
To: Pavel Vajarov
Cc: users@dpdk.org
Message-ID: <20200506075456.140625fb@hermes.lan>
Subject: Re: [dpdk-users] Performance troubleshooting of TCP/IP stack over DPDK.

On Wed, 6 May 2020 08:14:20 +0300
Pavel Vajarov wrote:

> Hi there,
>
> We are trying to compare the performance of the DPDK+FreeBSD networking
> stack against the standard Linux kernel, and we are having trouble
> finding out why the former is slower. The details are below.
>
> There is a project called F-Stack. It glues the networking stack from
> FreeBSD 11.01 on top of DPDK. We made a setup to test the performance of
> a transparent TCP proxy based on F-Stack and another one running on the
> standard Linux kernel.
> We did the tests on KVM with 2 cores (Intel(R) Xeon(R) Gold 6139 CPU @
> 2.30GHz) and 32GB RAM. A 10Gbps NIC was attached in passthrough mode.
> The application-level code, the part which handles the epoll
> notifications and memcpys data between the sockets, is 100% the same in
> both proxy applications. Both proxies are single threaded and in all
> tests we pinned the applications to core 1. For the test with the
> standard Linux application the interrupts from the network card were
> also pinned to core 1.
>
> Here are the test results:
> 1. The Linux-based proxy was able to handle about 1.7-1.8 Gbps before it
>    started to throttle the traffic. No visible CPU usage was observed on
>    core 0 during the tests; only core 1, where the application and the
>    IRQs were pinned, took the load.
> 2. The DPDK+FreeBSD proxy was able to handle 700-800 Mbps before it
>    started to throttle the traffic. Again no visible CPU usage was
>    observed on core 0; only core 1, where the application was pinned,
>    took the load. In some of the later tests I changed the number of
>    packets read from the network card in one call and the number of
>    events handled in one call to epoll. With these changes I was able to
>    increase the throughput to 900-1000 Mbps but couldn't get it any
>    higher.
> 3. We did another test with the DPDK+FreeBSD proxy just to get more
>    information about the problem. We disabled the TCP proxy
>    functionality and let the packets simply be IP-forwarded by the
>    FreeBSD stack. In this test we reached up to 5 Gbps without the
>    traffic being throttled; we just don't have more traffic to redirect
>    there at the moment. So the bottleneck seems to be either in the
>    upper layers of the network stack or in the application code.
>
> There is a Huawei switch which redirects the traffic to this server. It
> regularly sends ARP requests and, if the server doesn't respond, it
> stops the redirection. So we assumed that when the redirection stops
> it's because the server is throttling the traffic, dropping packets and
> failing to respond to the ARP requests because of those drops.
>
> The whole application can be very roughly represented in the following
> way:
> - Write pending outgoing packets to the network card
> - Read incoming packets from the network card
> - Push the incoming packets into the FreeBSD stack
> - Call epoll_wait/kevent without waiting
> - Handle the events
> - Loop from the beginning
>
> According to the performance profiling that we did, aside from packet
> processing, about 25-30% of the application time seems to be spent in
> epoll_wait/kevent, even though the `timeout` parameter of this call is
> set to 0, i.e. it shouldn't block waiting for events if there are none.
>
> I can give you much more details and code for everything, if needed.
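
A minimal sketch of the loop described above, assuming F-Stack's
ff_init()/ff_run()/ff_epoll_wait() wrappers; the handler name and its
internals are hypothetical, not taken from the actual proxy code:

/*
 * Illustrative sketch only: a single-threaded F-Stack style event loop.
 * Assumes F-Stack's ff_api.h / ff_epoll.h wrappers; handle_event() is a
 * hypothetical stand-in for the real proxy logic.
 */
#include "ff_api.h"
#include "ff_epoll.h"

#define MAX_EVENTS 512

static int epfd;
static struct epoll_event events[MAX_EVENTS];

static void handle_event(struct epoll_event *ev)
{
    /* Hypothetical: ff_read() from the ready socket, memcpy into the
     * peer connection's buffer, ff_write() it out. */
    (void)ev;
}

/* Called once per iteration by ff_run(); the DPDK RX/TX bursts and the
 * FreeBSD stack processing happen inside F-Stack around this callback. */
static int proxy_loop(void *arg)
{
    (void)arg;

    /* timeout == 0: poll for ready sockets without blocking */
    int n = ff_epoll_wait(epfd, events, MAX_EVENTS, 0);

    for (int i = 0; i < n; i++)
        handle_event(&events[i]);

    return 0;
}

int main(int argc, char *argv[])
{
    ff_init(argc, argv);           /* DPDK EAL + FreeBSD stack init */
    epfd = ff_epoll_create(0);     /* sockets get registered elsewhere */
    ff_run(proxy_loop, NULL);      /* drives NIC polling + the callback */
    return 0;
}

In this model the same core alternates between driving the NIC and
draining the event queue, so the RX burst size and the number of events
handled per iteration directly shift where that core's time goes, which
matches the effect of the tweaks described above.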
> My questions are:
> 1. Does somebody have observations or educated guesses about what
>    amount of traffic I should expect the DPDK + FreeBSD stack + kevent
>    combination to handle in the above scenario? Are the numbers low or
>    expected? We expected to see better performance than with the
>    standard Linux kernel, but so far we can't get there.
> 2. Do you think the difference comes from how the time is split between
>    handling packets and handling epoll in the two tests? What I mean is:
>    in the standard Linux test the interrupt handling has higher priority
>    than the epoll handling, so the application can spend much more time
>    handling and processing packets in the kernel than handling epoll
>    events in user space. In the DPDK+FreeBSD case the time for handling
>    packets and the time for processing epoll events is roughly equal. I
>    think that is why we were able to gain some performance by increasing
>    the number of packets read in one go and decreasing the number of
>    epoll events handled per call, but these tweaks were not enough to
>    raise the throughput far enough.
> 3. Can you suggest something else that we can test/measure/profile to
>    get a better idea of what exactly is happening here and to improve
>    the performance further?
>
> Any help is appreciated!
>
> Thanks in advance,
> Pavel.

First off, if you are testing on KVM, are you using PCI passthrough or
SR-IOV to make the device available to the guest directly? The default
mode uses a Linux bridge, and that results in multiple copies and context
switches; you end up testing Linux bridge and virtio performance, not TCP.

To get full speed with TCP on most software stacks you need TCP
segmentation offload. The software queue discipline, kernel version, and
TCP congestion control algorithm can also play a big role in your results.
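
On the TSO point, here is a rough sketch (generic DPDK, not F-Stack
specific) of how one could check whether the port actually advertises TCP
segmentation offload and request it at configure time; the flag spellings
are the pre-21.11 ones, and rte_eth_dev_info_get() returning int assumes a
reasonably recent DPDK release:

/*
 * Sketch: verify and request TCP segmentation offload on a DPDK port.
 * Plain rte_ethdev calls; newer DPDK releases spell the flags
 * RTE_ETH_TX_OFFLOAD_*. Error handling is trimmed for brevity.
 */
#include <stdio.h>
#include <rte_ethdev.h>

static int request_tso(uint16_t port_id, struct rte_eth_conf *port_conf)
{
    struct rte_eth_dev_info dev_info;

    if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
        return -1;

    if (!(dev_info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO)) {
        printf("port %u: driver reports no TSO support\n", port_id);
        return -1;
    }

    /* Ask for TSO plus the checksum offloads it depends on; the caller
     * then passes port_conf to rte_eth_dev_configure(). */
    port_conf->txmode.offloads |= DEV_TX_OFFLOAD_TCP_TSO |
                                  DEV_TX_OFFLOAD_TCP_CKSUM |
                                  DEV_TX_OFFLOAD_IPV4_CKSUM;
    return 0;
}

If the capability bit is missing (which can happen with some virtio or VF
setups), every large send gets segmented in software on that single core,
which by itself can explain a large gap against a kernel path that has
TSO/GSO available.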